mmh3 (MurmurHash3)
mmh3 is a Python extension for MurmurHash (MurmurHash3), a collection of fast and robust non-cryptographic hash functions invented by Austin Appleby. Currently at version 5.2.1, it is actively maintained, with updates for new Python versions, performance enhancements, and extended platform support. Typical applications include data mining, machine learning, and natural language processing.
Warnings
- breaking Support for Python 3.9 was dropped in version 5.2.1, and earlier versions dropped support for Python 3.7 (v4.0.0) and Python 3.6 (v3.1.0). Users on unsupported Python versions must either upgrade their Python environment or pin to an older `mmh3` version.
- breaking Starting from version 4.0.0, mmh3 became endian-neutral, meaning its hash functions return the same values on big-endian platforms as on little-endian ones. This is a backward-incompatible change for users who relied on endian-sensitive behavior on big-endian systems.
- breaking In version 5.0.0, the `seed` argument for hash functions became strictly validated to ensure it falls within the unsigned 32-bit integer range `[0, 0xFFFFFFFF]`. Providing a negative or out-of-range seed now raises a `ValueError`.
- deprecated The `hash_from_buffer()` function was deprecated in version 5.0.0. To hash buffer objects, use the newer, more performant `mmh3_32_sintdigest()` or `mmh3_32_uintdigest()` functions instead.
- gotcha By default, `mmh3.hash()` and `mmh3.hash64()` return signed integer values, while `mmh3.hash128()` returns an unsigned integer. To obtain unsigned results for 32-bit or 64-bit hashes, you must explicitly pass `signed=False`. For 128-bit hashes, `signed=True` is needed for signed output.
- gotcha Version 5.2.0 introduced experimental support for Python 3.14t (no-GIL) wheels. However, thread safety for this no-GIL variant is not yet fully tested. Use with caution in multi-threaded environments.
- gotcha The `mmh3.hash64()` function returns a tuple of two 64-bit integers, whereas `mmh3.hash()` returns a single 32-bit integer. It does so because it runs the 128-bit MurmurHash3 algorithm internally and splits the result into two halves. Code that expects a single 64-bit integer must pick one element of the tuple explicitly.
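The signedness and seed-range warnings above come down to plain two's-complement arithmetic, which can be sketched without calling the library at all. The helper names below (`to_unsigned`, `to_signed`, `normalize_seed`) are hypothetical and not part of mmh3. Wrapping a seed with a bit mask is only one possible policy for legacy seeds, and it is an assumption here that this is what a caller wants; it is not confirmed to reproduce the results of older mmh3 versions.

```python
MASK32 = 0xFFFFFFFF


def to_unsigned(value: int, bits: int = 32) -> int:
    """Reinterpret a signed two's-complement integer as unsigned."""
    return value & ((1 << bits) - 1)


def to_signed(value: int, bits: int = 32) -> int:
    """Reinterpret an unsigned integer as signed two's complement."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    # If the top bit is set, the signed reading is value - 2**bits.
    return value - (1 << bits) if value & sign_bit else value


def normalize_seed(seed: int) -> int:
    """Fold any integer into [0, 0xFFFFFFFF] (assumed policy, not mmh3 behavior)."""
    return seed & MASK32


# Sample signed 32-bit value and its unsigned reading of the same bits.
signed_sample = -156908512
unsigned_sample = to_unsigned(signed_sample)
print(unsigned_sample)
print(to_signed(unsigned_sample))
print(normalize_seed(-1))
```

With mmh3 installed, `to_unsigned(mmh3.hash(key))` should equal `mmh3.hash(key, signed=False)`, since both describe the same 32 bits. Passing `signed=False` directly is simpler when you control the call site.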
Install
pip install mmh3
Imports
- mmh3
import mmh3
Quickstart
import mmh3
# Basic 32-bit hash, returns a signed integer
hash_value = mmh3.hash("foo")
print(f"32-bit signed hash of 'foo': {hash_value}")
# 32-bit hash with a seed
hash_with_seed = mmh3.hash("foo", seed=42)
print(f"32-bit signed hash of 'foo' with seed 42: {hash_with_seed}")
# 32-bit unsigned hash
unsigned_hash = mmh3.hash("foo", signed=False)
print(f"32-bit unsigned hash of 'foo': {unsigned_hash}")
# 128-bit hash as a bytes object (16 bytes)
hash_as_bytes = mmh3.hash_bytes("foo")
print(f"128-bit hash of 'foo' as bytes: {hash_as_bytes}")
# 64-bit hash (returns a tuple of two 64-bit signed integers)
hash64_result = mmh3.hash64("foo")
print(f"64-bit hash of 'foo': {hash64_result}")
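The last call in the quickstart returns a pair rather than one number. If a single 64-bit value is needed, one simple option is to keep the first element and read it as unsigned. The sketch below assumes that choice. The helper `single_u64` is a hypothetical name, not an mmh3 function, and the sample pair is only an illustration of the tuple shape.

```python
from typing import Tuple

MASK64 = 0xFFFFFFFFFFFFFFFF


def single_u64(pair: Tuple[int, int]) -> int:
    """Return the first element of a hash64-style pair as an unsigned 64-bit int."""
    first, _second = pair
    return first & MASK64


# Sample pair in the shape hash64 returns (two signed 64-bit integers).
sample_pair = (-2129773440516405919, 9128664383759220103)
print(single_u64(sample_pair))
```

With mmh3 installed, `single_u64(mmh3.hash64(key))` should match `mmh3.hash64(key, signed=False)[0]`. Which half to keep is a design choice; either works as long as it is applied consistently.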