Bloom Filter 2

2.0.0 · active · verified Tue Apr 14

bloom-filter2 is a pure-Python Bloom filter module that provides a space-efficient, probabilistic set data structure. It supports mmap, in-memory, and disk-seek backends, so you can trade memory usage against lookup speed. The library automatically calculates optimal Bloom filter parameters from the user-specified maximum number of elements and desired false positive rate. It is compatible with CPython 3.x, PyPy, and Jython, and is actively maintained.
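To make the parameter calculation concrete, here is a minimal sketch of the textbook Bloom filter sizing formulas. It assumes the standard relations m = -n ln(p) / (ln 2)^2 for the number of bits and k = (m / n) ln 2 for the number of hash functions. The function name `ideal_bloom_parameters` is hypothetical, and bloom-filter2's internal calculation may round or differ in detail.

```python
import math


def ideal_bloom_parameters(max_elements, error_rate):
    """Return (num_bits, num_hashes) from the textbook sizing formulas.

    Hypothetical helper for illustration only; it is not part of
    bloom-filter2 and may not match the library's internal rounding.
    """
    if max_elements <= 0:
        raise ValueError("max_elements must be positive")
    if not 0 < error_rate < 1:
        raise ValueError("error_rate must be strictly between 0 and 1")

    # m = -n * ln(p) / (ln 2)^2, rounded up to a whole number of bits
    num_bits = math.ceil(-max_elements * math.log(error_rate) / (math.log(2) ** 2))
    # k = (m / n) * ln 2, rounded to the nearest whole hash function (at least 1)
    num_hashes = max(1, round(num_bits / max_elements * math.log(2)))
    return num_bits, num_hashes


num_bits, num_hashes = ideal_bloom_parameters(max_elements=10000, error_rate=0.01)
print(num_bits, num_hashes)
```

Under these formulas, 10,000 elements at a 1% false positive rate need a little under 10 bits per element (roughly 12 kB in total) and 7 hash functions, regardless of how large the individual keys are.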

Warnings

Install

Imports

Quickstart

Initialize a BloomFilter, add elements, and check for membership using the `in` operator. The `max_elements` and `error_rate` parameters control the filter's capacity and false positive probability.

from bloom_filter2 import BloomFilter

# Instantiate BloomFilter with custom settings:
# max_elements is how many elements you expect the filter to hold.
# error_rate defines accuracy (false positive probability).
# You can use defaults with `BloomFilter()` without any arguments.
bloom = BloomFilter(max_elements=10000, error_rate=0.01)

# Test whether the bloom-filter has seen a key:
assert "test-key" not in bloom

# Mark the key as seen
bloom.add("test-key")

# Now check again
assert "test-key" in bloom

# A key that was never added is normally reported as absent.
# A false positive is possible, but rare at this error_rate.
assert "another-key" not in bloom
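
The membership checks above can return a false positive but never a false negative. The sketch below illustrates why with a toy in-memory Bloom filter. The class `ToyBloomFilter`, its SHA-256 double-hashing scheme, and the sizing (about 96,000 bits and 7 hash functions, the textbook sizing for 10,000 elements at a 1% error rate) are assumptions made for illustration; this is not bloom-filter2's implementation.

```python
import hashlib


class ToyBloomFilter:
    """Illustrative in-memory Bloom filter; not bloom-filter2's implementation."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Split one SHA-256 digest into two 64-bit values and combine them
        # (double hashing) to derive num_hashes bit positions.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        # Adding a key only ever sets bits; nothing is cleared.
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # A key is reported present only if all of its bits are set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


toy = ToyBloomFilter(num_bits=95851, num_hashes=7)
added = [f"key-{i}" for i in range(10000)]
for key in added:
    toy.add(key)

# Every added key is found: set bits are never cleared, so no false negatives.
assert all(key in toy for key in added)

# Keys that were never added can still collide with bits set by other keys.
false_positives = sum(f"other-{i}" in toy for i in range(10000))
observed_rate = false_positives / 10000
print(f"observed false positive rate: {observed_rate:.4f}")
```

Because `add` only sets bits, a key that was inserted always passes the check. A key that was never inserted passes only when other keys happen to have set all of its bit positions, which is what `error_rate` bounds in expectation.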
