xxHash Python Binding
xxhash is a Python binding for xxHash, the high-performance, non-cryptographic hash library by Yann Collet. It is suited to applications that hash large volumes of data quickly, such as data integrity checks, file deduplication, and checksum generation. The library is actively maintained: the current version is 3.6.0, and new releases appear a few times a year, typically to support newer Python versions and to track upstream xxHash updates.
Warnings
- breaking: Python 3.6 support was dropped in `xxhash` v3.3.0. Users on Python 3.6 or older must use `xxhash` v3.2.0 or earlier.
- breaking: `xxhash` v2.0.0 introduced the XXH3 hashes (`xxh3_64`, `xxh3_128`) and requires version 0.8.0 or newer of the underlying xxHash C library.
- deprecated: `xxhash.VERSION_TUPLE` is deprecated as of v3.6.0 and will be removed in the next major release.
- gotcha: xxHash is a non-cryptographic hash function designed for speed. It is NOT suitable for security-critical uses such as password hashing, digital signatures, or HMAC, where resistance to deliberately constructed collisions is required.
- gotcha: The `seed` argument expects an unsigned 32-bit integer for `xxh32` and an unsigned 64-bit integer for `xxh64`. A negative or out-of-range value may be accepted without error and wrap around, so distinct seeds can yield the same hash. Validate seeds before passing them in.
- breaking: As of `xxhash` v0.3.0, the `digest()` method returns the big-endian byte representation of the integer hash. Earlier versions returned little-endian bytes, so raw digests produced by those versions will not match after upgrading.
Install
pip install xxhash
Imports
- xxhash
import xxhash
- xxh64
xxhash.xxh64(data)
- xxh3_64
xxhash.xxh3_64(data)
- xxh3_128
xxhash.xxh3_128(data)
Quickstart
import xxhash
data_to_hash = b"Hello, xxHash! This is some data to be hashed."
# One-shot hashing
hash_value_64 = xxhash.xxh64(data_to_hash, seed=0).hexdigest()
hash_value_128 = xxhash.xxh3_128(data_to_hash).hexdigest()
print(f"XXH64 Hash: {hash_value_64}")
print(f"XXH3_128 Hash: {hash_value_128}")
# Incremental hashing
hasher = xxhash.xxh64(seed=42)
hasher.update(b"First chunk ")
hasher.update(b"of data.")
incremental_hash = hasher.hexdigest()
print(f"Incremental XXH64 Hash: {incremental_hash}")