fastdigest

0.12.0 · active · verified Thu Apr 16

fastdigest is a Python library that provides a fast, Rust-backed implementation of the t-digest data structure. It offers a lightweight suite of online statistics for streaming and distributed data, including accurate estimates of quantiles, the CDF, and the trimmed mean. This entry covers version 0.12.0; the project maintains an active release cadence.

Install

pip install fastdigest

Imports

from fastdigest import TDigest

Quickstart

Initialize a TDigest, update it with streaming or batch data, and query for quantiles, median, and other statistics.

from fastdigest import TDigest
import random

# Create a new TDigest instance
digest = TDigest()

# Add values incrementally
for _ in range(10000):
    digest.update(random.random() * 100)

# Or create from a sequence of values (with optional weights)
data = [1.42, 2.71, 3.14, 5.0, 8.0, 13.0]
digest_from_values = TDigest.from_values(data)

# Add a batch of values
digest_from_values.batch_update([1.0, 2.0, 3.0, 4.0])

# Get quantiles
median = digest.quantile(0.5)
percentile_90 = digest.quantile(0.9)
print(f"Median: {median:.2f}")
print(f"90th Percentile: {percentile_90:.2f}")

# Get the number of centroids
print(f"Number of centroids: {len(digest)}")

# Check if empty (as method)
print(f"Is digest empty? {digest.is_empty()}")
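
One way to sanity-check digest estimates like those above is to compute the exact statistics on a sample small enough to sort in memory. The sketch below uses only the Python standard library and is not part of fastdigest. The helper names (exact_quantile, exact_cdf, exact_trimmed_mean) are hypothetical, and the index-based trimming rule is an assumption that may differ from how the library defines its trimmed mean.

```python
import random
import statistics
from bisect import bisect_right

# Hypothetical reference helpers; none of these names come from fastdigest.


def exact_quantile(sorted_values, q):
    """Linearly interpolated quantile of pre-sorted data, with 0 <= q <= 1."""
    if not sorted_values:
        raise ValueError("no data")
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


def exact_cdf(sorted_values, x):
    """Empirical CDF: fraction of values less than or equal to x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def exact_trimmed_mean(sorted_values, q_low, q_high):
    """Mean of the values whose sorted index lies in [q_low * n, q_high * n).

    Assumption: a simple index-based trimming rule chosen for illustration.
    """
    n = len(sorted_values)
    kept = sorted_values[int(q_low * n):int(q_high * n)]
    return statistics.fmean(kept)


random.seed(0)  # fixed seed so repeated runs use the same sample
ordered = sorted(random.random() * 100 for _ in range(10_000))

print(f"Exact median: {exact_quantile(ordered, 0.5):.2f}")
print(f"Exact 90th percentile: {exact_quantile(ordered, 0.9):.2f}")
print(f"Exact CDF at 25.0: {exact_cdf(ordered, 25.0):.3f}")
print(f"Exact trimmed mean (10%-90%): {exact_trimmed_mean(ordered, 0.1, 0.9):.2f}")
```

Comparing these exact values with the corresponding digest queries on the same data gives a rough feel for the estimation error. Sorting the full sample is only practical for small inputs, which is the case a streaming digest is meant to avoid.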
