Cython Hash Table for Pre-Hashed Keys

3.0.13 · active · verified Sun Mar 29

preshed is a high-performance Cython library for Python that provides hash table data structures for keys that have already been hashed. It offers `PreshMap` for key-value storage, `PreshCounter` for frequency counting, and `BloomFilter` for probabilistic set membership testing. Maintained by Explosion (the creators of spaCy), it is updated regularly, primarily for Python version compatibility and performance, with occasional major releases that introduce significant architectural changes.
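
As a rough illustration of the other two structures named above, the sketch below counts occurrences with `PreshCounter` and tests membership with `BloomFilter`. It assumes the `preshed.counter` and `preshed.bloom` modules expose `inc`, item lookup, `add`, and the `in` operator as they do in the 3.0.x releases; the key values are arbitrary stand-ins for hashes.

```python
from preshed.bloom import BloomFilter
from preshed.counter import PreshCounter

# Arbitrary example keys (assumption: any uint64 value is acceptable).
KEY_A = 1234567890123456789
KEY_B = 9876543210987654321

# Frequency counting: inc(key, amount) adds `amount` to the stored count.
counter = PreshCounter()
counter.inc(KEY_A, 1)
counter.inc(KEY_A, 4)
counter.inc(KEY_B, 2)
count_a = counter[KEY_A]
count_b = counter[KEY_B]
print(f"Count for KEY_A: {count_a}")
print(f"Count for KEY_B: {count_b}")

# Probabilistic membership: an added key is always reported as present,
# while a key that was never added may occasionally be reported as present.
bloom = BloomFilter()
seen_before_add = KEY_A in bloom
bloom.add(KEY_A)
seen_after_add = KEY_A in bloom
print(f"KEY_A in filter before add: {seen_before_add}")
print(f"KEY_A in filter after add: {seen_after_add}")
```

A Bloom filter trades exactness for memory: it cannot return the stored keys, only answer whether a key was possibly added.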

Warnings

Install

pip install preshed

Imports

from preshed.maps import PreshMap

Quickstart

Demonstrates basic usage of PreshMap: initialization, setting and getting items, membership testing, and deletion. Keys are expected to be 64-bit unsigned integers, and a lookup for a missing key returns None.

from preshed.maps import PreshMap

# PreshMap expects uint64 keys and values
my_map = PreshMap(initial_size=1024) # Initial size should be a power of 2

# Stand-ins for pre-hashed keys (in practice, produced by a hash function such as murmurhash)
key1 = 1234567890123456789 # Example uint64
key2 = 9876543210987654321

my_map[key1] = 100
my_map[key2] = 200

print(f"Value for key1: {my_map[key1]}") # Expected: 100
print(f"Value for key2: {my_map[key2]}") # Expected: 200
print(f"Is key1 in map: {key1 in my_map}") # Expected: True

# Test a missing key
missing_key = 1111111111111111111
print(f"Value for missing_key: {my_map[missing_key]}") # Expected: None

# Remove a key
del my_map[key1]
print(f"Is key1 in map after deletion: {key1 in my_map}") # Expected: False
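
The quickstart uses literal integers as keys. In practice the keys come from hashing the real data first. The sketch below is one possible way to derive 64-bit unsigned keys from strings using only the standard library; `hash_key` is a hypothetical helper, and `hashlib.blake2b` is a stand-in for murmurhash, not what preshed or spaCy use internally.

```python
import hashlib

UINT64_MAX = 2**64 - 1


def hash_key(text: str) -> int:
    """Map a string to a 64-bit unsigned integer key (illustrative helper)."""
    # An 8-byte digest converts directly to an integer in [0, 2**64 - 1].
    digest = hashlib.blake2b(text.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


# A hash table keyed by hashes cannot recover the original strings,
# so keep a side table if the strings are needed later.
words = ["apple", "banana", "cherry"]
key_to_word = {hash_key(word): word for word in words}

for key, word in key_to_word.items():
    print(f"{word!r} -> {key}")
```

Each resulting key could then be used wherever the quickstart uses `key1` or `key2`, for example `my_map[hash_key("apple")] = 100`.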
