{"id":1014,"library":"preshed","title":"Cython Hash Table for Pre-Hashed Keys","description":"preshed is a high-performance Cython library for Python that provides efficient hash table data structures. It's designed for use cases where keys are already pre-hashed, offering `PreshMap` for key-value storage, `PreshCounter` for frequency counting, and `BloomFilter` for probabilistic set membership testing. Maintained by Explosion (the creators of spaCy), it sees regular updates primarily for Python version compatibility and performance enhancements, with occasional major releases introducing significant architectural changes.","status":"active","version":"3.0.13","language":"python","source_language":"en","source_url":"https://github.com/explosion/preshed","tags":["Cython","hash table","performance","NLP","data structures","bloom filter","counter"],"install":[{"cmd":"pip install preshed --only-binary preshed","lang":"bash","label":"Pip Install (recommended for binary wheels)"},{"cmd":"conda install -c conda-forge preshed","lang":"bash","label":"Conda Install"}],"dependencies":[{"reason":"Required for memory management in Cython extensions; significant version bump (>=2.0.0) was a past breaking change.","package":"cymem","optional":false},{"reason":"Often used alongside preshed for pre-hashing keys, though not a strict direct runtime dependency of preshed itself in all contexts.","package":"murmurhash","optional":true}],"imports":[{"symbol":"PreshMap","correct":"from preshed.maps import PreshMap"},{"symbol":"BloomFilter","correct":"from preshed.bloom import BloomFilter"},{"symbol":"PreshCounter","correct":"from preshed.counter import PreshCounter"}],"quickstart":{"code":"from preshed.maps import PreshMap\n\n# PreshMap expects uint64 keys and values\nmy_map = PreshMap(initial_size=1024) # Initial size should be a power of 2\n\n# Simulate pre-hashed keys (e.g., using murmurhash)\nkey1 = 1234567890123456789 # Example uint64\nkey2 = 9876543210987654321\n\nmy_map[key1] = 
100\nmy_map[key2] = 200\n\nprint(f\"Value for key1: {my_map[key1]}\") # Expected: 100\nprint(f\"Value for key2: {my_map[key2]}\") # Expected: 200\nprint(f\"Is key1 in map: {key1 in my_map}\") # Expected: True\n\n# Test a missing key\nmissing_key = 1111111111111111111\nprint(f\"Value for missing_key: {my_map[missing_key]}\") # Expected: None\n\n# Remove a key\ndel my_map[key1]\nprint(f\"Is key1 in map after deletion: {key1 in my_map}\") # Expected: False","lang":"python","description":"Demonstrates the basic usage of PreshMap, including initialization, setting and getting items, membership testing, and deletion. Keys are expected to be 64-bit unsigned integers."},"warnings":[{"fix":"Review your code for direct C API interactions or assumptions about internal memory management. Adapt to the new C++-backed structures. For Python users, this should mostly be an internal change, but retesting is recommended.","message":"Version 4.0.0 introduced significant internal architectural changes, replacing raw arrays and pointers with `std::vector` and `std::unique_ptr` for `BloomFilter`, `PreshMap`, and `PreshCounter` implementations, and removing `PreshMapArray`. This affects users interacting with the C API or relying on specific internal memory layouts.","severity":"breaking","affected_versions":">=4.0.0"},{"fix":"Ensure your project's `cymem` dependency is updated to `cymem>=2.0.0`. If you have other dependencies pinning an older `cymem`, you may need to update those packages or manage your dependency tree carefully.","message":"Version 2.0.0 introduced a hard dependency on `cymem>=2.0.0`. Projects using an older version of `cymem` (e.g., `cymem<2.0.0`) would face dependency conflicts.","severity":"breaking","affected_versions":">=2.0.0"},{"fix":"Always pre-hash your keys into `uint64` integers using a robust hashing algorithm (e.g., `murmurhash`) before using them with `preshed` data structures. 
The library assumes keys are already randomized.","message":"The library is explicitly designed for 'pre-hashed' keys (uint64_t values). Feeding non-hashed or poorly hashed data directly into `PreshMap` or `PreshCounter` without proper pre-hashing can lead to suboptimal performance and hash collisions, negating the library's benefits.","severity":"gotcha","affected_versions":"All"},{"fix":"For multithreaded Python applications, ensure you understand the thread-safety guarantees of each `preshed` class. Use external locking mechanisms (e.g., `threading.Lock`) for `PreshCounter` and direct C API calls in concurrent contexts.","message":"While Python APIs for `BloomFilter` and `PreshMap` are thread-safe on Python 3.14+ (including free-threaded builds), the C API and `PreshCounter` class require external synchronization if used in a multithreaded environment to prevent race conditions and data corruption.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-05-12T22:36:26.452Z","next_check":"2026-06-27T00:00:00.000Z","problems":[{"fix":"Upgrade or reinstall the 'preshed' library, preferably in a clean virtual environment. 
If it's a dependency, try upgrading the main package (e.g., `pip install --upgrade spacy preshed`).","cause":"The 'preshed' library or one of its specific submodules (like 'bloom') was not correctly installed, or there is a version incompatibility with a dependent library such as spaCy.","error":"ModuleNotFoundError: No module named 'preshed.bloom'"},{"fix":"Install the necessary C compiler and Python development headers for your system (e.g., `sudo apt-get install build-essential python3-dev` on Debian/Ubuntu, or `sudo yum install gcc python3-devel` on CentOS/RHEL).","cause":"pip fell back to building preshed from source because no pre-compiled wheel was available for your Python version and OS, and the build failed because a working C compiler (like GCC) or the Python development headers are missing.","error":"error: command 'x86_64-linux-gnu-gcc' failed with exit status 1"},{"fix":"Install the appropriate Microsoft Visual C++ Build Tools (available from the Visual Studio Community edition or as a standalone Build Tools installer) and ensure they are correctly configured for your Python installation.","cause":"On Windows, building Cython extensions like preshed from source requires the Microsoft Visual C++ Build Tools; this error means they are not installed or could not be located by the build.","error":"DistutilsPlatformError: Unable to find vcvarsall.bat"},{"fix":"Ensure you have the necessary C/C++ compiler installed for your operating system (see compiler-specific fixes above). Also, update `pip` and `setuptools` (`pip install --upgrade pip setuptools`). 
Alternatively, consider using `conda install -c conda-forge preshed` which often provides pre-built binaries.","cause":"This general error occurs when pip attempts to build preshed from source (because a pre-compiled wheel is unavailable or incompatible), and the compilation process fails due to missing compiler tools, incompatible Python/Cython versions, or other build environment issues.","error":"Failed building wheel for preshed"}],"ecosystem":"pypi","meta_description":null,"install_score":100,"install_tag":"verified","quickstart_score":null,"quickstart_tag":null,"pypi_latest":"3.0.13","cli_name":"","install_checks":{"last_tested":"2026-05-12","tag":"verified","tag_description":"installs cleanly on critical runtimes, fast import, recently tested","installed_version":"3.0.13","pypi_latest":"3.0.13","is_stale":false,"results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"preshed","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0,"mem_mb":0.1,"disk_size":"24.4M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"preshed","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0,"mem_mb":0.1,"disk_size":"24.4M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"preshed","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":1.8,"import_time_s":0,"mem_mb":0.1,"disk_size":"22M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"preshed","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.1,"disk_size":"22M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine 
(musl)","variant":"preshed","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0,"mem_mb":0.1,"disk_size":"26.4M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"preshed","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.1,"disk_size":"26.4M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"preshed","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":1.8,"import_time_s":0,"mem_mb":0.1,"disk_size":"24M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"preshed","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0,"mem_mb":0.1,"disk_size":"24M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"preshed","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0,"mem_mb":0.1,"disk_size":"18.4M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"preshed","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0,"mem_mb":0.1,"disk_size":"18.4M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"preshed","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":1.6,"import_time_s":0,"mem_mb":0.1,"disk_size":"16M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim 
(glibc)","variant":"preshed","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.1,"disk_size":"16M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"preshed","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0,"mem_mb":0.4,"disk_size":"18.1M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"preshed","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0,"mem_mb":0.4,"disk_size":"18.0M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"preshed","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":1.6,"import_time_s":0,"mem_mb":0.2,"disk_size":"16M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"preshed","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.2,"disk_size":"16M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"preshed","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0,"mem_mb":0.1,"disk_size":"23.9M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"preshed","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0,"mem_mb":0.1,"disk_size":"23.9M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim 
(glibc)","variant":"preshed","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":2,"import_time_s":0,"mem_mb":0.1,"disk_size":"22M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"preshed","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.1,"disk_size":"22M"}]},"quickstart_checks":{"last_tested":"2026-04-24","tag":null,"tag_description":null,"results":[{"runtime":"python:3.10-alpine","exit_code":0},{"runtime":"python:3.10-slim","exit_code":0},{"runtime":"python:3.11-alpine","exit_code":0},{"runtime":"python:3.11-slim","exit_code":0},{"runtime":"python:3.12-alpine","exit_code":0},{"runtime":"python:3.12-slim","exit_code":0},{"runtime":"python:3.13-alpine","exit_code":0},{"runtime":"python:3.13-slim","exit_code":0},{"runtime":"python:3.9-alpine","exit_code":0},{"runtime":"python:3.9-slim","exit_code":0}]}}