ClickHouse CityHash Bindings

1.0.2.5 · active · verified Thu Apr 16

clickhouse-cityhash provides Python bindings for a specific, older version of Google's CityHash algorithm (v1.0.2). Its main purpose is compatibility with ClickHouse servers, which use this particular CityHash version internally for several hashing operations, including checksums of data sent over the native protocol. Newer CityHash releases produce different hash values for the same input, so they cannot be substituted. The package is a fork of the broader `python-cityhash` library, tailored to the ClickHouse ecosystem. The current version is 1.0.2.5, and it receives updates for compatibility and bug fixes.

Common errors

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to import and use the `CityHash64` and `CityHash128` functions. It highlights the crucial step of encoding Python strings to bytes before hashing, as CityHash operates on byte strings. It also shows how to hash integers consistently by converting them to a fixed-size byte representation.

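# The clickhouse-cityhash package installs its extension under the
# clickhouse_cityhash namespace; a bare `cityhash` import would load
# python-cityhash instead, if that package is installed.
from clickhouse_cityhash.cityhash import CityHash64, CityHash128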

data_string = 'hello world'
data_bytes = data_string.encode('utf-8')

hash64 = CityHash64(data_bytes)
hash128 = CityHash128(data_bytes)

print(f"CityHash64 for '{data_string}': {hash64}")
print(f"CityHash128 for '{data_string}': {hash128}")

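# Hashing an integer: convert it to a fixed-size byte string first.
# The width (8 bytes) and byte order ('big') are choices made here, not
# requirements of CityHash64; they must match whatever produced the
# hashes you compare against.
integer_data = 123456789
integer_bytes = integer_data.to_bytes(8, 'big')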
hash64_int = CityHash64(integer_bytes)
print(f"CityHash64 for integer {integer_data}: {hash64_int}")
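
Because the hash depends only on the bytes passed in, the encoding step decides whether two systems agree on a result. The following is a minimal sketch of that step using only the standard library; `to_hash_input` is a hypothetical helper name, not part of clickhouse-cityhash, and its default width and byte order simply mirror the quickstart above rather than any documented ClickHouse behavior.

```python
def to_hash_input(value, int_width=8, byteorder="big"):
    """Hypothetical helper: return the exact bytes to pass to a hash function.

    Assumptions (not taken from the library): strings are encoded as UTF-8,
    integers use a fixed width of `int_width` bytes in `byteorder` order.
    """
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode("utf-8")
    # bool is a subclass of int; reject it so True is not silently hashed as 1.
    if isinstance(value, bool):
        raise TypeError("bool is ambiguous; convert it explicitly")
    if isinstance(value, int):
        # Negative values need two's-complement (signed) encoding.
        return value.to_bytes(int_width, byteorder, signed=value < 0)
    raise TypeError(f"unsupported type: {type(value).__name__}")


# The same integer yields different byte strings, and therefore different
# hashes, depending on the byte order chosen.
print(to_hash_input("hello world"))
print(to_hash_input(123456789).hex())
print(to_hash_input(123456789, byteorder="little").hex())
```

The returned bytes can be passed directly to `CityHash64` or `CityHash128`. Keeping the conversion in one place makes it easier to keep width and byte order consistent across every call site.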
