mmh3 (MurmurHash3)

5.2.1 verified Tue May 12 auth: no python install: verified quickstart: verified

mmh3 is a Python extension for MurmurHash3, a family of fast and robust non-cryptographic hash functions created by Austin Appleby. The current version is 5.2.1. The project is actively maintained, with releases that add support for new Python versions, improve performance, and extend platform coverage. Typical applications include data mining, machine learning, and natural language processing.

pip install mmh3
error ModuleNotFoundError: No module named 'mmh3'
cause The mmh3 package is not installed in the Python environment being used or is not accessible in the current Python path.
fix
Install the package using pip: pip install mmh3
error AttributeError: module 'mmh3' has no attribute 'hash'
cause This error often occurs in specific environments like Bazel, or when there's a packaging issue that prevents the 'hash' function from being directly accessible via `mmh3.hash`, especially after version 4.0.0.
fix
Ensure mmh3 is correctly installed and its modules are properly resolved in your environment. For Bazel users, refer to specific Bazel integration solutions or ensure a compatible mmh3 version is used that works with your build setup. If this is a recent upgrade, rolling back to an earlier version (e.g., pip install mmh3==4.0.0) might temporarily resolve it, then check for environment-specific resolutions.
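One way to check how the module is being resolved is to ask the import system where `mmh3` would be loaded from. The sketch below uses only the standard library; `describe_module` is a hypothetical helper name, and the idea that some other file or directory named `mmh3` earlier on `sys.path` could be picked up instead of the installed package is an assumption to rule out, not a confirmed cause of this error.

```python
import importlib.util


def describe_module(name: str) -> dict:
    """Report whether `name` is importable and which file would be loaded."""
    # For a top-level name, find_spec() locates the module without importing it.
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"found": False, "origin": None}
    # origin is a file path for regular modules; it can be None for
    # namespace packages (a bare directory without __init__.py).
    return {"found": True, "origin": spec.origin}


info = describe_module("mmh3")
print(info)

if info["found"]:
    import mmh3

    # False here means the file shown in `origin` does not expose hash().
    print(hasattr(mmh3, "hash"))
```

If `origin` points somewhere other than your environment's site-packages directory, the installed package is probably not the one being imported.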
error error: Microsoft Visual C++ 14.0 or greater is required.
cause pip is building mmh3 from source on Windows, which requires a C++ compiler, and no suitable compiler was found on the system.
fix
Install the Microsoft C++ Build Tools, which provide the required compiler. Download them from the Visual Studio website, or select the 'Desktop development with C++' workload in the Visual Studio installer. After installation, run pip install mmh3 again.
breaking Support for Python 3.9 was dropped in version 5.2.1, and earlier versions dropped support for Python 3.7 (v4.0.0) and Python 3.6 (v3.1.0). Users on unsupported Python versions must either upgrade their Python environment or pin to an older `mmh3` version.
fix Upgrade to Python 3.10 or newer, or downgrade `mmh3` to a version compatible with your Python environment.
breaking Starting from version 4.0.0, mmh3 became endian-neutral, meaning its hash functions return the same values on big-endian platforms as on little-endian ones. This is a backward-incompatible change for users who relied on endian-sensitive behavior on big-endian systems.
fix If strict compatibility with the original C++ MurmurHash3 on big-endian systems is required, use `mmh3` version 3.*. Otherwise, recompute any hash values that were generated on big-endian systems with earlier versions.
breaking In version 5.0.0, the `seed` argument for hash functions became strictly validated to ensure it falls within the unsigned 32-bit integer range `[0, 0xFFFFFFFF]`. Providing a negative or out-of-range seed now raises a `ValueError`.
fix Ensure every seed passed to `mmh3` functions is a non-negative integer no larger than `0xFFFFFFFF`. If a negative seed was meant as the two's-complement form of an unsigned value, convert it first, for example with `seed & 0xFFFFFFFF`.
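A minimal sketch of that conversion follows. `to_unsigned_seed` is a hypothetical helper, not part of mmh3. Whether the masked value reproduces what an older release computed for the same negative seed is an assumption that should be checked against stored hashes before relying on it.

```python
def to_unsigned_seed(seed: int) -> int:
    """Map any Python int onto the unsigned 32-bit range [0, 0xFFFFFFFF]."""
    return seed & 0xFFFFFFFF


legacy_seed = -1
seed = to_unsigned_seed(legacy_seed)
print(seed)  # 4294967295

try:
    import mmh3
except ImportError:  # keeps the helper usable where mmh3 is not installed
    mmh3 = None

if mmh3 is not None:
    # The normalized seed is inside the range that 5.0.0+ validates.
    print(mmh3.hash("foo", seed=seed))
```

Masking silently wraps values above `0xFFFFFFFF` as well, so code that should reject such seeds needs an explicit range check instead.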
deprecated The `hash_from_buffer()` function was deprecated in version 5.0.0. It is recommended to use the newer, more performant `mmh3_32_sintdigest()` or `mmh3_32_uintdigest()` functions as alternatives for hashing buffer objects.
fix Replace calls to `mmh3.hash_from_buffer()` with `mmh3.mmh3_32_sintdigest()` (for signed integer output) or `mmh3.mmh3_32_uintdigest()` (for unsigned integer output).
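The migration can be sketched as a small wrapper that keeps the old call shape. `hash_buffer` is a hypothetical name introduced here. The seed is passed positionally on the assumption that the digest functions may not accept it as a keyword.

```python
try:
    import mmh3
except ImportError:  # keeps the sketch importable where mmh3 is absent
    mmh3 = None


def hash_buffer(buf, seed: int = 0, signed: bool = True) -> int:
    """Hypothetical stand-in for code that used to call hash_from_buffer()."""
    if mmh3 is None:
        raise RuntimeError("mmh3 is not installed")
    if signed:
        return mmh3.mmh3_32_sintdigest(buf, seed)
    return mmh3.mmh3_32_uintdigest(buf, seed)


if mmh3 is not None:
    data = memoryview(b"foo")
    print(hash_buffer(data))                # signed 32-bit integer
    print(hash_buffer(data, signed=False))  # unsigned 32-bit integer
```

The digest functions take bytes-like objects, so a `str` has to be encoded before it is passed in.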
gotcha By default, `mmh3.hash()` and `mmh3.hash64()` return signed integer values, while `mmh3.hash128()` returns an unsigned integer. To obtain unsigned results for 32-bit or 64-bit hashes, you must explicitly pass `signed=False`. For 128-bit hashes, `signed=True` is needed for signed output.
fix Always use the `signed` keyword argument (e.g., `mmh3.hash('data', signed=False)`) to explicitly control the output type (signed or unsigned) and ensure consistent results across different hash functions or desired interpretations.
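The signed and unsigned forms are two readings of the same bit pattern, so a value already stored in one form can be converted to the other without rehashing. The helpers below are hypothetical and independent of mmh3; the guarded section at the end compares them with the library when it is installed.

```python
def to_unsigned(value: int, bits: int) -> int:
    """Reinterpret a signed integer as an unsigned integer of `bits` width."""
    return value & ((1 << bits) - 1)


def to_signed(value: int, bits: int) -> int:
    """Reinterpret an unsigned integer of `bits` width as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


sample = -156908512                # an example signed 32-bit value
print(to_unsigned(sample, 32))     # 4138058784
print(to_signed(4138058784, 32))   # -156908512

try:
    import mmh3
except ImportError:
    mmh3 = None

if mmh3 is not None:
    # Each pair should match, since only the interpretation differs.
    print(to_unsigned(mmh3.hash("foo"), 32) == mmh3.hash("foo", signed=False))
    print(to_signed(mmh3.hash128("foo"), 128) == mmh3.hash128("foo", signed=True))
```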
gotcha Version 5.2.0 introduced experimental support for Python 3.14t (no-GIL) wheels. However, thread safety for this no-GIL variant is not yet fully tested. Use with caution in multi-threaded environments.
fix If thread safety is critical in a no-GIL environment, perform thorough testing or consider using standard GIL-enabled Python distributions until thread safety is fully verified for the no-GIL `mmh3` builds.
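A starting point for such testing is a smoke test that hashes the same keys from several threads and compares the results with a single-threaded baseline. `hashes_agree_across_threads` is a hypothetical helper; `zlib.crc32` is used only as a stand-in so that the harness runs where mmh3 is not installed. A passing run does not prove thread safety; it only fails to find a problem.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def hashes_agree_across_threads(hash_fn, keys, workers: int = 8, rounds: int = 50) -> bool:
    """Compare hashes computed concurrently with a single-threaded baseline."""
    expected = [hash_fn(key) for key in keys]

    def run(_round: int) -> list:
        return [hash_fn(key) for key in keys]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return all(result == expected for result in pool.map(run, range(rounds)))


keys = [f"key-{i}".encode() for i in range(1000)]

try:
    import mmh3
    hash_fn = mmh3.hash
except ImportError:
    hash_fn = zlib.crc32  # stand-in, not a MurmurHash3 implementation

print(hashes_agree_across_threads(hash_fn, keys))
```

Run it on the free-threaded interpreter you intend to deploy; on a GIL-enabled build the same test exercises far less real concurrency.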
gotcha The `mmh3.hash64()` function returns a tuple of two 64-bit integers, unlike `mmh3.hash()` which returns a single 32-bit integer, because it uses the 128-bit MurmurHash3 algorithm internally and splits the result. This can be a point of confusion if a single 64-bit integer is expected.
fix Unpack or index the tuple returned by `mmh3.hash64()`. If a single 64-bit integer is needed, select one element (for example, `mmh3.hash64(key)[0]`) or combine the two elements with your own logic.
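Two possible shapes for that custom logic are sketched below. Both helpers are hypothetical. The bit layout in `combine_halves` (first element in the low 64 bits) is a choice made for this sketch, and nothing here relies on it matching the output of `mmh3.hash128()`.

```python
MASK64 = (1 << 64) - 1


def first_half_unsigned(pair) -> int:
    """Use the first tuple element as a single unsigned 64-bit hash."""
    first, _second = pair
    return first & MASK64


def combine_halves(pair) -> int:
    """Pack both elements into one unsigned 128-bit integer."""
    first, second = pair
    return ((second & MASK64) << 64) | (first & MASK64)


sample = (-1, 1)  # stand-in for a hash64() result
print(first_half_unsigned(sample))  # 18446744073709551615
print(combine_halves(sample))       # 36893488147419103231

try:
    import mmh3
except ImportError:
    mmh3 = None

if mmh3 is not None:
    print(first_half_unsigned(mmh3.hash64("foo")))
```

Keeping one half discards 64 bits of the underlying 128-bit hash, which is usually acceptable for bucketing or sharding but should be a deliberate decision.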
| python | os / libc | status / wheel | install | import | disk |
| --- | --- | --- | --- | --- | --- |
| 3.9 | alpine (musl) | wheel | - | 0.00s | 17.6M |
| 3.9 | alpine (musl) | - | - | 0.00s | 17.6M |
| 3.9 | slim (glibc) | wheel | 2.1s | 0.00s | 18M |
| 3.9 | slim (glibc) | - | - | 0.00s | 18M |
| 3.10 | alpine (musl) | wheel | - | 0.00s | 18.1M |
| 3.10 | alpine (musl) | - | - | 0.00s | 18.1M |
| 3.10 | slim (glibc) | wheel | 1.7s | 0.00s | 19M |
| 3.10 | slim (glibc) | - | - | 0.00s | 19M |
| 3.11 | alpine (musl) | wheel | - | 0.00s | 20.0M |
| 3.11 | alpine (musl) | - | - | 0.00s | 20.0M |
| 3.11 | slim (glibc) | wheel | 1.7s | 0.00s | 21M |
| 3.11 | slim (glibc) | - | - | 0.00s | 21M |
| 3.12 | alpine (musl) | wheel | - | 0.00s | 11.8M |
| 3.12 | alpine (musl) | - | - | 0.00s | 11.8M |
| 3.12 | slim (glibc) | wheel | 1.6s | 0.00s | 12M |
| 3.12 | slim (glibc) | - | - | 0.00s | 12M |
| 3.13 | alpine (musl) | wheel | - | 0.00s | 11.6M |
| 3.13 | alpine (musl) | - | - | 0.00s | 11.5M |
| 3.13 | slim (glibc) | wheel | 1.5s | 0.00s | 12M |
| 3.13 | slim (glibc) | - | - | 0.00s | 12M |

This quickstart demonstrates the most common `mmh3` hash functions: `hash()` for 32-bit results (signed or unsigned), `hash_bytes()` for 128-bit results as bytes, and `hash64()` for a tuple of two 64-bit integers. Both `seed` and `signed` change the returned value, so pass the same arguments wherever hashes are compared.

import mmh3

# Basic 32-bit hash, returns a signed integer
hash_value = mmh3.hash("foo")
print(f"32-bit signed hash of 'foo': {hash_value}")

# 32-bit hash with a seed
hash_with_seed = mmh3.hash("foo", seed=42)
print(f"32-bit signed hash of 'foo' with seed 42: {hash_with_seed}")

# 32-bit unsigned hash
unsigned_hash = mmh3.hash("foo", signed=False)
print(f"32-bit unsigned hash of 'foo': {unsigned_hash}")

# 128-bit hash as a bytes object (16 bytes)
hash_as_bytes = mmh3.hash_bytes("foo")
print(f"128-bit hash of 'foo' as bytes: {hash_as_bytes}")

# 64-bit hash (returns a tuple of two 64-bit signed integers)
hash64_result = mmh3.hash64("foo")
print(f"64-bit hash of 'foo': {hash64_result}")