ClickHouse Connect
ClickHouse Connect is the official Python driver for ClickHouse. It provides a high-performance core database interface for Python applications, with support for Pandas DataFrames, NumPy arrays, PyArrow tables, and Polars DataFrames, plus Apache Superset integration. It uses the ClickHouse HTTP interface for maximum compatibility and is actively maintained.
Warnings
- breaking The parameter `apply_server_timezone` in client and query methods was renamed to `tz_source` in v0.14.0.
- breaking Version 0.13.0 introduced a native write path for the `Variant` data type. Previously, values were stringified; now they are serialized using their native ClickHouse types client-side, which changes how `Variant` columns store data.
- breaking The legacy executor-based asynchronous client (`AsyncClient(client=...)`) and related parameters (`executor_threads`, `executor`, `pool_mgr`) have been removed. The native aiohttp-based async client is now standard.
- deprecated Python 3.9 support is deprecated and will be removed in version 1.0. Python 3.8 is End-of-Life (EOL) and no longer officially tested or supported; wheels are not built for 3.8 AARCH64 versions.
- gotcha Optional dependencies (e.g., `numpy`, `pandas`, `pyarrow`, `polars`, `sqlalchemy`) are lazy-loaded. If you intend to use features relying on these, you must install them explicitly using the `[extra]` syntax (e.g., `pip install clickhouse-connect[pandas]`).
- gotcha ClickHouse server versions 22.8/22.9 and 22.10+ use incompatible internal serialization formats for experimental JSON. Using multiple clients against a mix of 22.8/22.9 and 22.10+ servers will break if JSON support is enabled.
- deprecated Pandas 1.x support is deprecated and will be dropped in 1.0.
- gotcha When creating a DBAPI Connection or SQLAlchemy DSN, unrecognized keyword arguments or query parameters will now raise an exception instead of being passed as ClickHouse server settings. Server settings should be prefixed with `ch_`.
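To illustrate the last warning, here is a minimal sketch of building a SQLAlchemy-style DSN in which every server setting carries the `ch_` prefix. The `build_dsn` helper is hypothetical, and the `clickhousedb://` scheme and the `max_threads` setting are assumptions used for illustration; check the driver documentation for the dialect name your version registers.

```python
from urllib.parse import quote, urlencode


def build_dsn(host, port=8123, username="default", password="",
              database="default", server_settings=None):
    """Build a DSN string; each server setting is emitted as ch_<name>.

    Hypothetical helper: the 'clickhousedb' scheme is an assumption.
    """
    # Prefix every server setting so it is not treated as an unknown parameter.
    query = {f"ch_{name}": value for name, value in (server_settings or {}).items()}
    auth = quote(username, safe="")
    if password:
        auth += ":" + quote(password, safe="")
    dsn = f"clickhousedb://{auth}@{host}:{port}/{database}"
    if query:
        dsn += "?" + urlencode(query)
    return dsn


print(build_dsn("localhost", server_settings={"max_threads": 4}))
# clickhousedb://default@localhost:8123/default?ch_max_threads=4
```

Keeping the prefix in one helper avoids scattering raw setting names through connection strings, where they would now raise an exception.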
Install
- pip install clickhouse-connect
- pip install "clickhouse-connect[pandas,numpy,sqlalchemy,polars,arrow,async]"
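Because optional dependencies are lazy-loaded, a missing package only surfaces when the feature is first used. The sketch below checks up front which optional modules can be imported, using only the standard library. The `missing_modules` helper and the module list are illustrative assumptions, not part of ClickHouse Connect.

```python
from importlib.util import find_spec

# Import names assumed to back the optional features listed above.
OPTIONAL_MODULES = ("numpy", "pandas", "pyarrow", "polars", "sqlalchemy")


def missing_modules(modules=OPTIONAL_MODULES):
    """Return the names in `modules` that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All optional dependencies are available.")
```

Running this at application startup turns a late failure into an early, readable message.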
Imports
- get_client
import clickhouse_connect
client = clickhouse_connect.get_client(...)
Quickstart
import os

import clickhouse_connect

host = os.environ.get('CH_HOST', 'localhost')
port = int(os.environ.get('CH_PORT', '8123'))  # Use 8443 for TLS/Cloud
username = os.environ.get('CH_USER', 'default')
password = os.environ.get('CH_PASSWORD', '')
database = os.environ.get('CH_DB', 'default')

client = None
try:
    client = clickhouse_connect.get_client(
        host=host,
        port=port,
        username=username,
        password=password,
        database=database,
        secure=(port == 8443),  # Automatically use TLS for 8443
    )

    # Test connection
    client.ping()
    print(f"Successfully connected to ClickHouse at {host}:{port}")

    # Create a table. Adjacent string literals are joined into one statement;
    # separating them with commas would pass them as extra positional arguments.
    client.command(
        "CREATE TABLE IF NOT EXISTS my_test_table ("
        " id UInt64,"
        " name String,"
        " value Float64"
        ") ENGINE MergeTree ORDER BY id"
    )
    print("Table 'my_test_table' created or already exists.")

    # Insert data
    data_to_insert = [
        [1, 'Alpha', 100.1],
        [2, 'Beta', 200.2],
        [3, 'Gamma', 300.3],
    ]
    client.insert('my_test_table', data_to_insert, column_names=['id', 'name', 'value'])
    print("Data inserted into 'my_test_table'.")

    # Query data
    result = client.query('SELECT * FROM my_test_table ORDER BY id')
    print("Query Results:")
    for row in result.result_set:
        print(row)

    # Example with Pandas (requires 'pandas' extra)
    try:
        import pandas as pd  # Raises ImportError if pandas is not installed
        df = client.query_df('SELECT * FROM my_test_table')
        print("\nPandas DataFrame Results:")
        print(df)
    except ImportError:
        print("\nSkipping Pandas example: 'pandas' not installed. Install with `pip install clickhouse-connect[pandas]`")
except Exception as e:
    print(f"An error occurred: {e}")
finally:
    # Close the connection whether or not an error occurred
    if client is not None:
        client.close()
        print("Connection closed.")
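The quickstart queries use fixed SQL strings. When a query depends on user input, one option is to bind values instead of formatting them into the SQL text. The sketch below assumes the `{name:Type}` placeholder syntax with the `parameters` argument of `client.query`, and reads rows from `result_rows`. The `fetch_rows_above` helper is hypothetical, and it takes the client as an argument so it can be exercised without a live server.

```python
def fetch_rows_above(client, threshold):
    """Return rows of my_test_table whose value exceeds `threshold`.

    Hypothetical helper: `client` is expected to be the object returned by
    clickhouse_connect.get_client(...).
    """
    result = client.query(
        "SELECT id, name, value FROM my_test_table "
        "WHERE value > {threshold:Float64} ORDER BY id",
        parameters={"threshold": threshold},  # Bound by name, not string-formatted
    )
    return result.result_rows


# Usage, assuming `client` was created as in the quickstart:
# rows = fetch_rows_above(client, 150.0)
```

Passing the client in, rather than creating it inside the function, also makes the helper easy to test with a stand-in object.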