ClickHouse Pool
clickhouse-pool is a Python library that provides a thread-safe connection pool for ClickHouse, built upon the `clickhouse-driver` library. It aims to efficiently manage and reuse connections to a ClickHouse server, reducing the overhead of establishing new connections for each query. The library is actively maintained, with its latest version being 0.6.1, and receives regular updates including bug fixes and dependency bumps.
Warnings
- breaking In version 0.4.0, the direct methods `get_conn()` and `put_conn()` on the pool were renamed to `pull()` and `push()` respectively. While the recommended approach is now `pool.get_client()` with a context manager, older code directly using these methods will break.
- breaking Version 0.6.0 introduced a breaking change by updating its Python requirement to Python 3.9 or newer. Installations on older Python versions (e.g., 3.8 or below) will fail or not be able to upgrade.
- gotcha Previous versions (prior to 0.5.3) had a connection pool bug that could lead to improper connection handling. Users on older versions might experience unexpected connection issues.
- gotcha The `ChPool` is configured with `connections_min` and `connections_max` parameters. If the number of concurrent client requests exceeds `connections_max`, a `ChPoolError.TooManyConnections` exception will be raised. This is intended behavior for a bounded pool but must be handled.
- gotcha Connections acquired from the pool via `get_client()` should ideally be managed within a `with` statement. If manually acquired (e.g., not using a context manager), ensure `pool.cleanup()` is called at the end of your application's lifecycle to properly close all connections and prevent resource leaks.
Install
-
pip install clickhouse-pool
Imports
- ChPool
from clickhouse_pool import ChPool
Quickstart
import os
from clickhouse_pool import ChPool
# Configure connection details. For production, use environment variables or a config file.
host = os.environ.get('CLICKHOUSE_HOST', 'localhost')
port = int(os.environ.get('CLICKHOUSE_PORT', 9000)) # Native protocol port
user = os.environ.get('CLICKHOUSE_USER', 'default')
password = os.environ.get('CLICKHOUSE_PASSWORD', '')
database = os.environ.get('CLICKHOUSE_DB', 'default')
# Initialize the connection pool
# connections_min and connections_max can be adjusted for your workload
pool = ChPool(
host=host,
port=port,
user=user,
password=password,
database=database,
connections_min=5,
connections_max=10
)
try:
with pool.get_client() as client:
# Execute a query
result = client.execute("SELECT number, 'hello' FROM system.numbers LIMIT 5")
print("Query Result:", result)
# Example of an insert
# client.execute("CREATE TABLE IF NOT EXISTS my_table (id UInt64, value String) ENGINE = Memory")
# client.execute("INSERT INTO my_table VALUES", [(1, 'test1'), (2, 'test2')])
# print("Data inserted.")
# result_insert = client.execute("SELECT * FROM my_table")
# print("Inserted Data:", result_insert)
except Exception as e:
print(f"An error occurred: {e}")
finally:
# Always close all connections in the pool once you're done with it
pool.cleanup()
print("Connection pool cleaned up.")