{"id":4382,"library":"pybloom-live","title":"pybloom-live: Bloom Filter Implementation","description":"pybloom-live is a Python library providing an efficient implementation of the Bloom filter probabilistic data structure. It also offers a Scalable Bloom Filter, which can dynamically grow its capacity. Currently at version 4.0.0, it is a fork of the original `pybloom` project, with improvements like a consistent tightening ratio. It aims to provide fast, space-efficient membership testing for large datasets where a small probability of false positives is acceptable.","status":"active","version":"4.0.0","language":"en","source_language":"en","source_url":"https://github.com/joseph-fox/python-bloomfilter","tags":["data structure","bloom filter","probabilistic","membership testing","scalable"],"install":[{"cmd":"pip install pybloom-live","lang":"bash","label":"Install pybloom-live"}],"dependencies":[],"imports":[{"symbol":"BloomFilter","correct":"from pybloom_live import BloomFilter"},{"symbol":"ScalableBloomFilter","correct":"from pybloom_live import ScalableBloomFilter"}],"quickstart":{"code":"from pybloom_live import BloomFilter\n\n# Initialize a Bloom filter with a capacity of 1000 elements\n# and an acceptable false positive rate of 0.01 (1%)\nbloom = BloomFilter(capacity=1000, error_rate=0.01)\n\n# Add elements\nbloom.add(\"apple\")\nbloom.add(\"banana\")\nbloom.add(\"orange\")\n\n# Check for membership\nprint(f\"Is 'apple' in the filter? {'apple' in bloom}\")  # Expected: True\nprint(f\"Is 'grape' in the filter? {'grape' in bloom}\")  # Expected: False (barring a rare false positive)\n\n# Note: a membership test on an item that was never added (such as 'grape')\n# can return True with a small probability (a false positive). An item\n# that was added is never reported as absent.","lang":"python","description":"This example demonstrates how to initialize a basic `BloomFilter`, add elements, and test them for membership. 
You set the expected capacity and acceptable error rate at initialization."},"warnings":[{"fix":"Upgrade to Python 2.7+ or Python 3.x, or pin `pybloom-live<3.0.0` if Python 2.6 is unavoidable.","message":"Version 3.0.0 dropped support for Python 2.6. Users still on Python 2.6 must either stay on a pre-3.0.0 release of pybloom-live or upgrade their Python environment.","severity":"breaking","affected_versions":">=3.0.0"},{"fix":"Understand the implications of false positives for your application. If zero false positives are required, a different data structure (e.g., a hash set) is necessary, often at the cost of higher memory usage.","message":"Bloom filters are probabilistic data structures that can produce 'false positives'. They might indicate that an element is present when it is not, but they never produce 'false negatives' (they never report an element as absent if it was actually added).","severity":"gotcha","affected_versions":"All"},{"fix":"Accurately estimate your maximum set size for `BloomFilter`. If the set size is unpredictable or grows over time, consider using `ScalableBloomFilter`, which grows as elements are added.","message":"The `BloomFilter` (non-scalable) requires you to pre-define an estimated `capacity` and `error_rate`. If the actual number of elements significantly exceeds `capacity`, the false positive rate rises well beyond the specified `error_rate`.","severity":"gotcha","affected_versions":"All"},{"fix":"Verify that your import statements use `from pybloom_live import ...`. Check your `pip freeze` output to confirm that `pybloom-live` is installed rather than the older `pybloom` library.","message":"`pybloom-live` is a fork of the original `pybloom` library. While `pybloom-live` is actively maintained and has improvements, ensure you are importing from `pybloom_live` (e.g., `from pybloom_live import BloomFilter`) to use the correct version and features. 
Accidental imports from an older `pybloom` might lead to unexpected behavior or missing features.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}