pyprobables

0.7.0 · active · verified Wed Apr 15

pyprobables is a pure-Python library that implements common probabilistic data structures, including Bloom filters, Count-Min sketches, Cuckoo filters, and Quotient filters. These structures trade a small, bounded error rate for large memory savings in tasks such as set membership testing and approximate frequency counting. The library is actively maintained; the current version is 0.7.0, and releases bring new features, bug fixes, and changes to supported Python versions.

Warnings

Install

pip install pyprobables

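Imports

The package is installed as pyprobables but imported as probables:

from probables import BloomFilter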

Quickstart

This quickstart demonstrates how to initialize a BloomFilter, add elements to it, and check for element membership. Bloom filters are used for approximate set membership testing, guaranteeing no false negatives but allowing for a configurable rate of false positives.

from probables import BloomFilter

# Initialize a Bloom filter for 100,000 elements with a 0.05 (5%) false positive rate
blm = BloomFilter(est_elements=100000, false_positive_rate=0.05)

# Add elements
blm.add('apple')
blm.add('banana')
blm.add('orange')

# Check for membership
print(f"Is 'apple' in the filter? {blm.check('apple')}")
print(f"Is 'grape' in the filter? {blm.check('grape')}")

# Check a value that was never added. With only three elements in a filter sized
# for 100,000, a false positive here is extremely unlikely; as the filter fills up,
# a non-member will occasionally return True.
if blm.check('nonexistent_fruit'):
    print("Warning: A false positive occurred for 'nonexistent_fruit'.")
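To give a feel for what the est_elements and false_positive_rate arguments above cost in memory, here is a minimal sketch of the textbook Bloom filter sizing formulas. The helper name bloom_parameters is hypothetical and is not part of pyprobables; the library's own sizing and rounding may differ, so treat the numbers as an estimate rather than what BloomFilter allocates internally.

```python
import math


def bloom_parameters(est_elements: int, false_positive_rate: float) -> tuple[int, int]:
    """Return (bits, hash_count) from the standard Bloom filter formulas.

    Hypothetical helper for illustration only; not a pyprobables function.
    bits   m = -n * ln(p) / (ln 2)^2
    hashes k = (m / n) * ln 2
    """
    if est_elements <= 0:
        raise ValueError("est_elements must be positive")
    if not 0.0 < false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")

    bits = math.ceil(
        -est_elements * math.log(false_positive_rate) / math.log(2) ** 2
    )
    # Round to the nearest whole number of hash functions, at least one.
    hashes = max(1, round(bits / est_elements * math.log(2)))
    return bits, hashes


# Same parameters as the quickstart: 100,000 elements at a 5% false positive rate.
bits, hashes = bloom_parameters(100_000, 0.05)
print(f"bits: {bits}, approx KiB: {bits / 8 / 1024:.1f}, hash functions: {hashes}")
```

Under these formulas the quickstart's parameters work out to roughly 6.2 bits per element. Lowering the false positive rate increases both the bit array size and the number of hash functions.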
