{"id":8986,"library":"fastdigest","title":"fastdigest","description":"Fastdigest is a Python library that provides a lightning-fast implementation of the t-digest data structure, built on Rust. It offers a lightweight suite of online statistics for streaming and distributed data, enabling accurate estimation of quantiles, CDF, trimmed mean, and more. The library is currently at version 0.12.0 and maintains an active release cadence.","status":"active","version":"0.12.0","language":"en","source_language":"en","source_url":"https://github.com/moritzmucha/fastdigest","tags":["data-structures","statistics","t-digest","quantiles","rust","streaming-data","percentiles","online-algorithms"],"install":[{"cmd":"pip install fastdigest","lang":"bash","label":"Install stable version from PyPI"}],"dependencies":[{"reason":"Optional dependency for efficient weighted updates and array-based initialization methods like `TDigest.from_values` with `w` argument.","package":"numpy","optional":true}],"imports":[{"symbol":"TDigest","correct":"from fastdigest import TDigest"}],"quickstart":{"code":"from fastdigest import TDigest\nimport random\n\n# Create a new TDigest instance\ndigest = TDigest()\n\n# Add values incrementally\nfor _ in range(10000):\n    digest.update(random.random() * 100)\n\n# Or create from a sequence of values (with optional weights)\ndata = [1.42, 2.71, 3.14, 5.0, 8.0, 13.0]\ndigest_from_values = TDigest.from_values(data)\n\n# Add a batch of values\ndigest_from_values.batch_update([1.0, 2.0, 3.0, 4.0])\n\n# Get quantiles\nmedian = digest.quantile(0.5)\npercentile_90 = digest.quantile(0.9)\nprint(f\"Median: {median:.2f}\")\nprint(f\"90th Percentile: {percentile_90:.2f}\")\n\n# Get the number of centroids\nprint(f\"Number of centroids: {len(digest)}\")\n\n# Check if empty (as method)\nprint(f\"Is digest empty? 
{digest.is_empty()}\")","lang":"python","description":"Initialize a TDigest, update it with streaming or batch data, then query quantiles (including the median), the centroid count, and whether the digest is empty."},"warnings":[{"fix":"Replace `digest.mass` with `digest.mass()` and `digest.is_empty` with `digest.is_empty()`.","message":"`mass` and `is_empty` were converted from properties to methods. Code that still accesses them without parentheses gets a bound method instead of the value, so checks such as `if digest.is_empty:` are always truthy and arithmetic on `digest.mass` fails.","severity":"breaking","affected_versions":">=0.12.0"},{"fix":"Pass a non-negative integer for `max_centroids`, both when constructing a `TDigest` and when assigning to the property.","message":"Type handling and validation changed for the `max_centroids` constructor argument and property setter. A negative `max_centroids` now raises a `ValueError` (previously it could cause an `OverflowError` or unexpected behavior).","severity":"breaking","affected_versions":">=0.9.0"},{"fix":"Expect `std()` results to differ from, and be more accurate than, those of versions prior to 0.12.0, especially for non-normal data distributions.","message":"The `std()` estimate changed in v0.12.0. It now estimates population variance via centroid second moments, making it more accurate and faster than the previous MAD-based estimation (from v0.11.0), which was only strictly valid for approximately normal distributions. 
Results for non-normal distributions will differ.","severity":"gotcha","affected_versions":">=0.12.0"},{"fix":"Always ensure that all elements within the iterable passed to `merge_all` are valid `TDigest` instances.","message":"Calling `merge_all` with an iterable containing non-`TDigest` objects will now explicitly raise a `TypeError` instead of potentially panicking or leading to unexpected behavior.","severity":"gotcha","affected_versions":">=0.8.1"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Call `mass` as a method: `digest.mass()`","cause":"In versions 0.12.0 and later, `mass` was changed from a property to a method.","error":"AttributeError: 'TDigest' object has no attribute 'mass'"},{"fix":"Call `is_empty` as a method: `digest.is_empty()`","cause":"In versions 0.12.0 and later, `is_empty` was changed from a property to a method.","error":"AttributeError: 'TDigest' object has no attribute 'is_empty'"},{"fix":"Ensure that `max_centroids` is always a non-negative integer value, e.g., `TDigest(max_centroids=50)`.","cause":"Attempting to initialize `TDigest` or set `max_centroids` with a negative value or non-integer type.","error":"ValueError: max_centroids must be a non-negative integer"},{"fix":"Verify that all elements in the list or iterable passed to `TDigest.merge_all()` are indeed `TDigest` objects.","cause":"The `merge_all` method received an iterable containing objects that are not `TDigest` instances.","error":"TypeError: object of type 'int' has no len() (or similar when merging non-TDigest objects)"},{"fix":"Consider reducing the `max_centroids` parameter to limit memory usage, process data in smaller batches, or ensure sufficient memory is available for large operations.","cause":"The `TDigest` attempted to allocate memory for a large number of centroids, exceeding available system memory. 
This can happen with very large merges or when `max_centroids` is set too high for the data volume.","error":"MemoryError: failed to allocate TDigest (possibly during merge operation)"}]}