{"id":7775,"library":"tdigest","title":"T-Digest data structure","description":"The `tdigest` library is a Python implementation of Ted Dunning's t-digest data structure, designed for efficient and accurate percentile and quantile estimation from streaming or distributed data. It enables computations like percentiles, quantiles, and trimmed means. The current official PyPI version is 0.5.2.2, with releases focusing on performance improvements and bug fixes. The library is actively maintained with occasional updates.","status":"active","version":"0.5.2.2","language":"en","source_language":"en","source_url":"https://github.com/CamDavidsonPilon/tdigest","tags":["data-structure","streaming","quantiles","percentiles","statistics","approximation","distributed-computing","big-data"],"install":[{"cmd":"pip install tdigest","lang":"bash","label":"Install latest PyPI version"}],"dependencies":[{"reason":"Commonly used for generating sample data in examples and for batch updates.","package":"numpy","optional":true}],"imports":[{"symbol":"TDigest","correct":"from tdigest import TDigest"}],"quickstart":{"code":"import numpy as np\nfrom tdigest import TDigest\n\n# Create a TDigest instance\ndigest = TDigest()\n\n# Update the digest sequentially with random data\nfor _ in range(5000):\n    digest.update(np.random.random())\n\n# Or update the digest in batches\nanother_digest = TDigest()\nanother_digest.batch_update(np.random.random(5000))\n\n# Compute the 15th percentile\nprint(f\"15th percentile (sequential): {digest.percentile(15)}\")\nprint(f\"15th percentile (batch): {another_digest.percentile(15)}\")\n\n# Sum two digests\nsum_digest = digest + another_digest\nprint(f\"30th percentile (summed): {sum_digest.percentile(30)}\")","lang":"python","description":"Initializes a TDigest object, updates it with data either sequentially or in batches, and demonstrates how to compute percentiles and merge two digests. 
Requires `numpy` for random data generation."},"warnings":[{"fix":"Replace `digest.quantile(x)` with `digest.cdf(x)`.","message":"The `quantile` method was renamed to `cdf` in version 0.5.0 for more accurate terminology. Calling `quantile()` on versions 0.5.0 and later will result in an AttributeError.","severity":"breaking","affected_versions":">=0.5.0"},{"fix":"For the latest version, consider `pip install git+https://github.com/CamDavidsonPilon/tdigest.git` (use with caution, may not be production-ready stable).","message":"The latest PyPI version (0.5.2.2) lags behind the latest GitHub release (v0.6.0.1). Users seeking the absolute latest features or bug fixes might need to install directly from GitHub, though this is not typically recommended for production.","severity":"gotcha","affected_versions":"All"},{"fix":"Ensure your project uses Python 3.6+ to align with modern Python practices and better library support.","message":"Older documentation and PyPI metadata mention compatibility with Python 2. Given Python 2's end-of-life, new development and most recent versions of `tdigest` are exclusively targeting Python 3. 
Relying on Python 2 compatibility is strongly discouraged and likely to lead to issues.","severity":"gotcha","affected_versions":"<0.5.0 (for implied Python 2 support), all (for outdated docs)"},{"fix":"Prefer `to_dict()` and `update_from_dict()` for robust serialization, especially when saving digests for later use or cross-version compatibility.","message":"While `tdigest` objects can be serialized to and from Python dictionaries using `to_dict()` and `update_from_dict()`, direct pickling might not always be forward/backward compatible across minor versions due to internal changes in the underlying data structure (`accumulation_tree` in v0.5.0).","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Use the `cdf` method instead: `digest.cdf(x)`.","cause":"Attempting to call the `quantile` method on a `TDigest` object in version 0.5.0 or later, after it was renamed.","error":"AttributeError: 'TDigest' object has no attribute 'quantile'"},{"fix":"Upgrade `pip` to the latest version (`pip install --upgrade pip`) and ensure your Python environment supports modern TLS protocols. Consider upgrading your Python version if it's very old.","cause":"This error can occur with outdated `pip` versions or Python environments (often Python 2 or older Python 3.x) attempting to access PyPI mirrors with modern SSL/TLS configurations. This specific error was reported when installing `tdigest` indirectly pulling `cython`.","error":"Download error on https://pypi.org/simple/cython/: [SSL: TLSV1_ALERT_PROTOCOL_VERSION]"},{"fix":"Ensure all `TDigest` objects involved in merge operations (`+` operator or `merge` method) are properly initialized and contain data. 
For example, initialize the variable with `TDigest()` rather than leaving it as `None`.","cause":"This typically happens when trying to merge a `TDigest` object with a variable that is `None`, often due to an uninitialized digest or a failed previous operation.","error":"TypeError: unsupported operand type(s) for +: 'TDigest' and 'NoneType'"}]}