pybloom-live: Bloom Filter Implementation

4.0.0 · active · verified Sun Apr 12

pybloom-live is a Python library that implements the Bloom filter, a probabilistic data structure for set-membership testing. It also provides a Scalable Bloom Filter, which grows its capacity as elements are added, so the final set size does not need to be known in advance. The current version is 4.0.0. It is a fork of the original `pybloom` project, with improvements such as a tightening ratio that is applied consistently. It targets fast, space-efficient membership testing over large datasets where a small probability of false positives is acceptable.
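To give a feel for the space/accuracy trade-off, the sketch below applies the standard textbook Bloom filter sizing formulas. The function name `textbook_bloom_size` is hypothetical and is not part of pybloom-live; the library's internal sizing may differ, so treat the numbers as an estimate only.

```python
import math


def textbook_bloom_size(capacity, error_rate):
    """Estimate (bits, hash_count) with the textbook Bloom filter formulas.

    Hypothetical helper for illustration; not a pybloom-live function.
    """
    if capacity <= 0 or not 0 < error_rate < 1:
        raise ValueError("capacity must be > 0 and error_rate in (0, 1)")
    # m = -n * ln(p) / (ln 2)^2 : total bits for n items at false positive rate p
    bits = math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2))
    # k = (m / n) * ln 2 : number of hash functions that minimizes p
    hashes = max(1, round(bits / capacity * math.log(2)))
    return bits, hashes


bits, hashes = textbook_bloom_size(capacity=1000, error_rate=0.01)
print(f"{bits} bits (about {bits / 8 / 1024:.1f} KiB), {hashes} hash functions")
```

Under these formulas, 1000 elements at a 1% error rate need a little under 10,000 bits (about 1.2 KB), regardless of how large the individual elements are.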

Warnings

Install

Imports

Quickstart

This example demonstrates how to initialize a basic `BloomFilter`, add elements, and check for their probable membership. You define the expected capacity and acceptable error rate during initialization.

from pybloom_live import BloomFilter

# Initialize a Bloom filter with a capacity of 1000 elements
# and an acceptable false positive rate of 0.01 (1%)
bloom = BloomFilter(capacity=1000, error_rate=0.01)

# Add elements
bloom.add("apple")
bloom.add("banana")
bloom.add("orange")

# Check for membership
print(f"Is 'apple' in the filter? {'apple' in bloom}")  # True: it was added
print(f"Is 'grape' in the filter? {'grape' in bloom}")  # Expected: False

# Note: A Bloom filter has no false negatives, so an element that was added
# always tests True. An element that was never added, such as 'grape', can
# test True with a small probability (a false positive), bounded by the
# configured error rate while the filter stays within its capacity.
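The behaviour described in the note can be reproduced with a toy filter built from the standard library alone. This is a minimal sketch of the general technique, not pybloom-live's implementation: the class `ToyBloomFilter`, its parameters, and the seeded SHA-256 hashing scheme are all assumptions made for illustration.

```python
import hashlib


class ToyBloomFilter:
    """Illustrative Bloom filter; hypothetical, not the pybloom-live class."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, item):
        # Derive num_hashes bit positions by prefixing the item with a seed.
        data = str(item).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(seed.to_bytes(4, "big") + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        # Set every bit the item hashes to; bits are never cleared.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True only if all of the item's bits are set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


toy = ToyBloomFilter(num_bits=9600, num_hashes=7)
for fruit in ("apple", "banana", "orange"):
    toy.add(fruit)

print("apple" in toy)  # always True: added items set all of their bits
print("grape" in toy)  # very likely False, but a false positive is possible
```

Because `add` only ever sets bits, a lookup for an added element can never fail. A false positive occurs when other elements happen to have set all of the bits that an absent element hashes to.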
