ClickHouse Migrations
clickhouse-migrations is a Python library designed for managing file-based database migrations in ClickHouse. It supports multi-statement SQL files, cluster deployments, and ensures migration state consistency across nodes. The project is actively maintained, with the current version being 0.9.1, and receives regular updates.
Common errors
-
Dirty database version X. Fix and force version. Applying clickhouse migrations failed. This is mostly caused by the database being unavailable. Exiting..
cause A previous migration failed and left the internal `_migrations` table in a 'dirty' state, preventing further migrations.fixFirst, investigate the failed migration (version X) to determine its actual state. If it was partially applied, manually complete or revert changes. Then, update the `_migrations` table to mark version X as clean (e.g., `INSERT INTO _migrations (version, dirty) VALUES (X, 0);`) or delete its entry if not applied. Backup your data before manual modifications. -
Error: code: 62, message: Syntax error (Multi-statements are not allowed): failed at position X (end of query) (line Y, col Z): ; select 2;
cause ClickHouse client (or the library's internal client) received multiple SQL statements in a single query without the multi-statement mode enabled, or semicolons are misinterpreted.fixEnsure that if you are using multi-statement SQL files, you pass `multi_statement=True` when calling `cluster.migrate()`. If using the CLI, the tool attempts to split statements by semicolon, so avoid semicolons within string literals or comments in your SQL. -
error: failed to open database: code: 516, message: <db_name>: Authentication failed: password is incorrect, or there is no user with such name.
cause The provided ClickHouse password contains special characters that are not correctly parsed or URL-encoded by the underlying connection mechanism.fixIf the password contains special characters, ensure they are URL-encoded when provided via environment variables (`CH_DB_PASSWORD`) or directly in connection strings. For example, `P@ssword!` should be `P%40ssword%21`. -
DB::Exception: There is already a column with name 'new_column' in table 'my_table'
cause An `ALTER TABLE ... ADD COLUMN` statement was executed on a ClickHouse cluster where the column already existed on some nodes, leading to a duplicate column error.fixModify your migration SQL to include `IF NOT EXISTS` when adding columns, especially in clustered environments: `ALTER TABLE my_table ON CLUSTER default ADD COLUMN IF NOT EXISTS new_column String;`.
Warnings
- breaking Python 3.7 support was dropped in version 0.5.0. Ensure your environment uses Python 3.9 or higher.
- gotcha ClickHouse does not support transactions for DDL operations. If a migration fails mid-way, the schema may be left in a partially migrated and inconsistent state, which will not be automatically rolled back.
- gotcha When using multi-statement migration files, the library splits SQL statements by semicolons. This can lead to issues if semicolons are present within string literals or comments in your SQL queries.
- gotcha For ClickHouse cluster deployments, `ALTER TABLE ... ADD COLUMN` statements might fail on subsequent nodes if they don't include `IF NOT EXISTS`, leading to 'DUPLICATE_COLUMN' errors.
- gotcha ClickHouse passwords containing special characters (like `@`, `&`, `#`, `!`, `%`, `?`, `$`, `:`) may cause authentication failures when passed via CLI or environment variables if not properly URL-encoded.
Install
-
pip install clickhouse-migrations
Imports
- ClickhouseCluster
from clickhouse_migrations.clickhouse_cluster import ClickhouseCluster
Quickstart
import os
from clickhouse_migrations.clickhouse_cluster import ClickhouseCluster
db_host = os.environ.get('CH_DB_HOST', 'localhost')
db_user = os.environ.get('CH_DB_USER', 'default')
db_password = os.environ.get('CH_DB_PASSWORD', '')
db_name = os.environ.get('CH_DB_NAME', 'test_db')
migrations_home = os.environ.get('CH_MIGRATIONS_HOME', './migrations')
# Ensure migrations directory exists for the example
os.makedirs(migrations_home, exist_ok=True)
# Create a dummy migration file for the example
with open(f"{migrations_home}/1_create_test_table.sql", "w") as f:
f.write("CREATE TABLE IF NOT EXISTS my_table (id UInt64, name String) ENGINE = MergeTree() ORDER BY id;")
cluster = ClickhouseCluster(
db_host=db_host,
db_user=db_user,
db_password=db_password
)
try:
print(f"Applying migrations to {db_host}/{db_name} from {migrations_home}...")
cluster.migrate(
db_name=db_name,
migrations_home=migrations_home,
create_db_if_no_exists=True,
multi_statement=True
)
print("Migrations applied successfully.")
except Exception as e:
print(f"An error occurred during migration: {e}")