{"id":7658,"library":"rbloom","title":"rBloom","description":"rBloom is a fast Bloom filter library for Python, implemented in Rust. It provides a simple, lightweight probabilistic data structure whose API closely mimics Python's built-in `set`. Currently at version 1.5.4, it is designed for high-performance set membership testing with a low memory footprint. It is updated regularly, often to track new PyO3 releases.","status":"active","version":"1.5.4","language":"en","source_language":"en","source_url":"https://github.com/KenanHanke/rbloom","tags":["Bloom Filter","Rust","data structure","probabilistic","set-like API","performance"],"install":[{"cmd":"pip install rbloom","lang":"bash","label":"Install rBloom"}],"dependencies":[],"imports":[{"symbol":"Bloom","correct":"from rbloom import Bloom"}],"quickstart":{"code":"from rbloom import Bloom\n\n# Initialize a Bloom filter for 200 items with a 1% false positive rate\nbf = Bloom(200, 0.01)\n\n# Add items\nbf.add(\"hello\")\nbf.add(\"world\")\n\n# Check for membership (single quotes inside the braces keep the\n# f-strings valid on Python versions before 3.12)\nprint(f\"'hello' in bf: {'hello' in bf}\")\nprint(f\"'python' in bf: {'python' in bf}\")\n\n# Update with multiple items\nbf.update([\"rust\", \"fast\"])\n\n# Set-like operations (both filters share the same parameters)\nother_bf = Bloom(200, 0.01)\nother_bf.add(\"rust\")\nunion_bf = bf | other_bf # Union of filters\nprint(f\"'rust' in union_bf after union: {'rust' in union_bf}\")","lang":"python","description":"Initializes a Bloom filter with a specified capacity and false positive rate, then demonstrates adding single and multiple elements, checking for membership, and performing a set-like union operation."},"warnings":[{"fix":"Provide a custom, deterministic hash function (e.g., using `hashlib`) when initializing the `Bloom` filter if you intend to serialize it or use it across multiple Python processes. 
Ensure the same hash function is used for both saving and loading.","message":"When serializing `Bloom` filters (e.g., to bytes) or comparing them across different Python process invocations, you must provide a custom, stable hash function. Python salts the built-in `hash()` for strings and bytes with a value that changes between invocations, so the same item hashes differently in each process and `__contains__` or comparison results are wrong for deserialized or cross-process filters. A `Bloom` filter without a custom hash function is only reliable within the single Python process that created it.","severity":"gotcha","affected_versions":"All versions (v1.5.0 onwards for serialization)"},{"fix":"Ensure all `Bloom` filters involved in set operations or comparisons are initialized with the same arguments (`expected_items`, `false_positive_rate`, and `hash_func`).","message":"For `Bloom` filter set operations (union `|`, intersection `&`) and comparisons (`issubset`, `issuperset`, `==`, `!=`), all participating filters must have identical parameters (capacity, false positive rate, and the exact same hash function object) to ensure correct behavior.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Install the Rust toolchain (rustup recommended) and `maturin` (`pip install maturin`) if you need to build `rbloom` from source. Ensure your Rust toolchain is up to date.","message":"If a pre-built wheel is not available for your platform or Python version, `rbloom` will attempt to build from source. This requires an installed Rust toolchain, including `cargo` and `maturin`, which can be a hurdle in some environments.","severity":"gotcha","affected_versions":"All versions for source builds"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Ensure that all objects added to the `Bloom` filter are hashable (e.g., strings, numbers, tuples, immutable custom objects). 
Convert mutable objects to an immutable representation if necessary before adding them.","cause":"Attempting to add an unhashable Python object (like a list or dictionary) to the Bloom filter. With the default hash function, Bloom filters, like Python sets, require elements to be hashable.","error":"TypeError: unhashable type: 'list'"},{"fix":"When creating the `Bloom` filter, provide a custom, deterministic hash function (e.g., serializing the object to bytes, hashing the bytes with `hashlib.sha256`, and converting part of the digest to an integer) to ensure consistent hashes for persistence and cross-process usage. Example: `bf = Bloom(capacity, error_rate, hash_func=my_stable_hash_function)`.","cause":"Using Python's default `hash()` function, which generates different hash values for strings and bytes across Python process invocations due to a random salt. This breaks consistency for serialized filters or filters used in distributed systems.","error":"`'item' in bf` returns False for an item that was added, or `bf1 == bf2` returns False despite both filters holding the same elements, especially after loading from bytes or in another process."},{"fix":"Verify `rbloom` is installed with `pip show rbloom`. If not, run `pip install rbloom`. If issues persist, check that the active Python environment (virtual environment) is the one where `rbloom` was installed, or try reinstalling in a clean environment.","cause":"The `rbloom` package is not correctly installed, or another module named `rbloom` (for example, a local `rbloom.py` file) shadows the installed package.","error":"ImportError: cannot import name 'Bloom' from 'rbloom' (/path/to/rbloom/__init__.py)"}]}