{"id":7049,"library":"bloomfilter-py","title":"Bloomfilter-py","description":"Bloomfilter-py is a Python library providing a Bloom filter implementation, notable for its compatibility with Java's Guava library's serialization format. It allows for seamless reading and writing of Bloom filters between Python and Java applications. The current version is 1.1.0, and it has a moderate release cadence, with its latest update in August 2024.","status":"active","version":"1.1.0","language":"en","source_language":"en","source_url":"https://github.com/OldPanda/bloomfilter-py.git","tags":["bloom filter","data structure","probabilistic","guava","java compatibility","set membership"],"install":[{"cmd":"pip install bloomfilter-py","lang":"bash","label":"Install stable version"}],"dependencies":[],"imports":[{"symbol":"BloomFilter","correct":"from bloomfilter import BloomFilter"}],"quickstart":{"code":"from bloomfilter import BloomFilter\n\n# Initialize a Bloom filter with expected insertions and desired error rate\nbloom_filter = BloomFilter(expected_insertions=1000, err_rate=0.01)\n\n# Add elements\nbloom_filter.put(\"apple\")\nbloom_filter.put(\"banana\")\nbloom_filter.put(\"orange\")\n\n# Check for membership\nprint(f\"Is 'apple' in filter? { 'apple' in bloom_filter }\")\nprint(f\"Is 'grape' in filter? { 'grape' in bloom_filter }\")\n\n# Serialize to bytes (Guava compatible)\nserialized_data = bloom_filter.dumps()\n\n# Deserialize from bytes\nloaded_filter = BloomFilter.loads(serialized_data)\nprint(f\"Is 'banana' in loaded filter? { 'banana' in loaded_filter }\")","lang":"python","description":"Demonstrates initializing a Bloom filter, adding elements, checking for membership, and performing Guava-compatible serialization/deserialization."},"warnings":[{"fix":"Tune `expected_insertions` and `err_rate` during initialization to match your application's requirements for memory and accuracy. 
Keep in mind that a lower `err_rate` or higher `expected_insertions` will increase memory usage.","message":"Bloom filters inherently have a false positive rate, meaning `element in bloom_filter` might return True for elements not actually added. There are no false negatives. The `err_rate` parameter controls this trade-off.","severity":"gotcha","affected_versions":"All versions"},{"fix":"If element deletion is a requirement, consider using a 'Counting Bloom Filter' (not provided by this library) or rebuilding the filter periodically with only the active elements.","message":"Standard Bloom filters, including this implementation, do not support deletion of elements. Attempting to remove an element would compromise the integrity of other stored elements by clearing shared bits.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Ensure that any external systems you interact with are also using or compatible with Java Guava's Bloom filter serialization. If not, custom interoperability or a different Bloom filter library may be required.","message":"This library's primary feature is compatibility with Java's Guava Bloom filter serialization format. Its `dumps()` and `loads()` methods are specifically designed for this. It is unlikely to be compatible with Bloom filters serialized by other Python libraries or non-Guava Java implementations.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Monitor the number of insertions. 
If you anticipate exceeding the initial `expected_insertions`, consider building a new, larger Bloom filter and re-inserting the elements from your original data source (a Bloom filter cannot enumerate its own contents), or design your system to tolerate the increased false positive rate.","message":"Inserting more elements than the `expected_insertions` provided during initialization causes the false positive rate to rise above the specified `err_rate`, and it keeps growing as more elements are added.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Re-evaluate your expected number of insertions and your acceptable error rate. Initialize the `BloomFilter` with a larger `expected_insertions` or a smaller `err_rate`. Remember that increasing capacity or lowering the error rate will consume more memory.","cause":"The Bloom filter was initialized with an `expected_insertions` value lower than the number of items actually inserted, or with an `err_rate` higher than the application can tolerate.","error":"False positives are too high, or the Bloom filter seems to always return True."},{"fix":"The correct import path is `from bloomfilter import BloomFilter`. If the error persists, check that no local file or other installed package named `bloomfilter` is shadowing this library.","cause":"Incorrect import statement for the BloomFilter class, or possibly a local module or different package named `bloomfilter` shadowing this library.","error":"AttributeError: module 'bloomfilter' has no attribute 'BloomFilter'"},{"fix":"Verify that both the sender and receiver of the Bloom filter data use, or are compatible with, Java Guava's Bloom filter serialization format. If not, you will need to use a different Bloom filter library that provides a common serialization format, or implement custom conversion logic.","cause":"This library is specifically designed for compatibility with Java's Guava Bloom filter serialization format. 
Other implementations may use different hashing functions, bit array structures, or serialization schemes.","error":"Data deserialized from another Bloom filter implementation (e.g., Python `pybloom` or a custom C++ implementation) is not recognized or yields incorrect results when loaded by `bloomfilter-py`, or vice-versa."}]}