{"id":5149,"library":"cityhash","title":"CityHash and FarmHash Python Bindings","description":"CityHash is a family of fast non-cryptographic hash functions for strings, originally developed by Google. FarmHash is a successor designed for improved performance and collision resistance on modern CPUs. This library, `python-cityhash`, provides Python bindings for both CityHash and FarmHash, enabling high-performance hashing in Python applications. It is currently at version 0.4.10 and receives updates for Python version compatibility and Cython build fixes.","status":"active","version":"0.4.10","language":"en","source_language":"en","source_url":"https://github.com/escherba/python-cityhash","tags":["hashing","checksum","non-cryptographic","performance","cityhash","farmhash"],"install":[{"cmd":"pip install cityhash","lang":"bash","label":"Install from PyPI"}],"dependencies":[],"imports":[{"symbol":"CityHash32","correct":"from cityhash import CityHash32"},{"symbol":"CityHash64","correct":"from cityhash import CityHash64"},{"symbol":"CityHash128","correct":"from cityhash import CityHash128"},{"symbol":"FarmHash32","correct":"from farmhash import FarmHash32"},{"symbol":"FarmHash64","correct":"from farmhash import FarmHash64"},{"symbol":"FarmHash128","correct":"from farmhash import FarmHash128"},{"symbol":"Fingerprint128","correct":"from farmhash import Fingerprint128"},{"symbol":"CityHashCrc128","correct":"from cityhash.cityhashcrc import CityHashCrc128"},{"symbol":"CityHashCrc256","correct":"from cityhash.cityhashcrc import CityHashCrc256"}],"quickstart":{"code":"import cityhash\nimport farmhash\n\n# Hashing a string (must be encoded to bytes)\ndata_string = \"hello world\"\nhash_64 = cityhash.CityHash64(data_string.encode('utf-8'))\nprint(f\"CityHash64 of '{data_string}': {hash_64}\")\n\n# Hashing bytes directly\ndata_bytes = b\"another example\"\nhash_128 = cityhash.CityHash128(data_bytes)\nprint(f\"CityHash128 of '{data_bytes.decode()}': {hash_128}\")\n\n# Hashing with FarmHash\nfarm_hash_64 = farmhash.FarmHash64(b\"farmhash test\")\nprint(f\"FarmHash64 of 'farmhash test': {farm_hash_64}\")\n\n# Hashing an integer (must be converted to fixed-size bytes)\nint_data = 123456789\nhashed_int = cityhash.CityHash64(int_data.to_bytes(8, 'big'))\nprint(f\"CityHash64 of integer {int_data}: {hashed_int}\")","lang":"python","description":"This example calls CityHash and FarmHash functions on a string, a bytes object, and an integer. The string is encoded to bytes and the integer is converted to a fixed-size byte representation so that the resulting hashes are reproducible."},"warnings":[{"fix":"Upgrade to Python 3 or pin the library version to <0.4.0 if Python 2 compatibility is essential.","message":"Version 0.4.0 dropped support for Python 2. Projects targeting Python 2 must use an older version of the library (e.g., <0.4.0).","severity":"breaking","affected_versions":">=0.4.0"},{"fix":"Ensure all string inputs are explicitly encoded to bytes using a consistent encoding (e.g., `my_string.encode('utf-8')`).","message":"CityHash and FarmHash functions operate on byte strings, not Python unicode strings. Attempting to hash a string directly will result in a TypeError or incorrect results. Always encode strings to bytes (e.g., `my_string.encode('utf-8')`) before passing them to hashing functions.","severity":"gotcha","affected_versions":"All"},{"fix":"Convert integers to a fixed-size byte sequence: `integer_value.to_bytes(8, 'big')` (or `'little'`, depending on the desired endianness). Pass `signed=True` if negative values are possible.","message":"When hashing integers, convert them to a fixed-size byte representation (e.g., 8 bytes for values that fit in 64 bits) with a fixed byte order so that results are consistent and reproducible. Variable-length byte representations can produce different hashes for the same value.","severity":"gotcha","affected_versions":"All"},{"fix":"For stream processing or incremental hashing needs, use a library that explicitly supports this feature (e.g., MetroHash or xxHash).","message":"These CityHash and FarmHash bindings do not support incremental hashing, so they are not suitable for hashing long character streams or data that arrives in chunks. For incremental hashing, consider libraries like MetroHash or xxHash.","severity":"gotcha","affected_versions":"All"},{"fix":"For cryptographic hashing, use the standard library `hashlib` (e.g., SHA-256, BLAKE2b). For password storage, use a key-derivation function such as `hashlib.scrypt` or `hashlib.pbkdf2_hmac`.","message":"CityHash and FarmHash are *non-cryptographic* hash functions. They are optimized for speed and good distribution, but they are not designed to resist deliberately constructed collisions and should NOT be used for security-sensitive applications like password storage or digital signatures.","severity":"gotcha","affected_versions":"All"},{"fix":"Use `numpy.ascontiguousarray()` to convert non-contiguous arrays before hashing: `cityhash.CityHash64(numpy.ascontiguousarray(arr))`.","message":"When hashing NumPy arrays or other objects exposing the Python Buffer Protocol, ensure the array is contiguous in memory. Non-contiguous arrays might lead to unexpected results.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-13T00:00:00.000Z","next_check":"2026-07-12T00:00:00.000Z"}