runstats
RunStats is an Apache 2.0 licensed Python module for computing online statistics and linear regression in a single pass. It is designed for efficiently processing large data streams or generators where previous values are not retained, which makes it suitable for long-running systems. The library, currently at version 2.0.0, is actively maintained and provides numerically stable calculations for various statistical measures and regression coefficients. [1, 2, 3]
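To illustrate what "single pass" means here, the following is a minimal sketch of Welford's online algorithm, a well-known technique for numerically stable running mean and variance. It is an illustration only and is not taken from the RunStats source; the class name `RunningMoments` and its attributes are hypothetical.

```python
import math


class RunningMoments:
    """Hypothetical single-pass mean/variance tracker (Welford's algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def push(self, value):
        # Update the moments from one new value; the value itself is not stored.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    def variance(self):
        # Sample variance (divides by n - 1); needs at least two values.
        return self._m2 / (self.count - 1)

    def stddev(self):
        return math.sqrt(self.variance())


moments = RunningMoments()
for value in (2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0):
    moments.push(value)
print(moments.count, moments.mean, moments.stddev())
```

Memory use stays constant however many values are pushed, which is the property that makes this style of computation practical for streams and generators.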
Common errors
- ModuleNotFoundError: No module named 'runstats'
  cause The `runstats` library has not been installed or is not accessible in the current Python environment.
  fix Install the library using pip: `pip install runstats`.
- AttributeError: 'Statistics' object has no attribute 'slope'
  cause Attempting to access a method or attribute specific to the `Regression` class (e.g., `slope`, `intercept`, `correlation`) on a `Statistics` object, or vice versa.
  fix Ensure you are using the correct class for the desired calculation. Use `Regression` for linear regression attributes (`slope`, `intercept`, `correlation`) and `Statistics` for basic descriptive statistics (`mean`, `stddev`, `minimum`, etc.).
- TypeError: object of type 'ExponentialStatistics' has no len()
  cause Attempting to retrieve the count of items in an `ExponentialStatistics` object using `len()`, which is not supported.
  fix The `ExponentialStatistics` class does not track a fixed count due to its decaying nature. If a count of operations is needed, implement a manual counter alongside the `ExponentialStatistics` object.
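One way to keep the manual counter suggested for the `len()` error is a thin wrapper that counts `push()` calls. This is a sketch, not part of `runstats`: the class name `CountedStats` and its attributes are hypothetical, and the wrapper only assumes that the wrapped object exposes a `push(value)` method.

```python
class CountedStats:
    """Hypothetical wrapper: counts push() calls for any wrapped accumulator."""

    def __init__(self, stats):
        self.stats = stats  # e.g. an ExponentialStatistics instance
        self.count = 0

    def push(self, value):
        # Forward the value, then record that one more value was seen.
        self.stats.push(value)
        self.count += 1

    def __len__(self):
        return self.count


# Intended use, assuming runstats is installed:
#     from runstats import ExponentialStatistics
#     counted = CountedStats(ExponentialStatistics())
#     counted.push(1.5)
#     len(counted)           # number of push() calls so far
#     counted.stats.mean()   # statistics still come from the wrapped object
```

Keeping the count outside the statistics object leaves the decayed mean and variance untouched while still answering "how many values have been pushed".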
Warnings
- gotcha The `runstats` library includes an optional Cython-optimized extension that is roughly 20-40x faster than the pure-Python version. If Cython is not installed or the extension fails to build, the library silently falls back to the slower pure-Python implementation, so check which implementation is actually loaded when performance matters. [1, 3, 4]
- gotcha The `ExponentialStatistics` class does not support the `len()` method, unlike `Statistics` and `Regression` objects. Attempting to call `len()` on an `ExponentialStatistics` instance will raise a `TypeError`. This is by design, as exponential statistics decay older values and do not represent a fixed count. [4]
- gotcha When combining `ExponentialStatistics` objects using the `+` operator, the resulting object's decay rate will be inherited from the leftmost `ExponentialStatistics` object in the operation. This behavior can be unexpected if different decay rates are involved. [4]
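For the first gotcha above, a quick check is to look at which module provides the imported class. The sketch below uses only the standard library plus the documented `from runstats import Statistics` import; the helper name `describe_runstats` is hypothetical, and the exact module names of the compiled and pure-Python implementations are not asserted here, so interpret the printed name against your installed version.

```python
import importlib.util


def describe_runstats():
    """Report whether runstats is importable and which module provides Statistics."""
    if importlib.util.find_spec("runstats") is None:
        return "runstats is not installed (try: pip install runstats)"
    from runstats import Statistics

    # __module__ names the module that defines the class that was imported.
    return f"Statistics is provided by module: {Statistics.__module__}"


print(describe_runstats())
```

Running this once per deployment environment makes a silent fallback visible without having to benchmark anything.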
Install
- pip install runstats
Imports
- Statistics
from runstats import Statistics
- Regression
from runstats import Regression
- ExponentialStatistics
from runstats import ExponentialStatistics
Quickstart
import random
from runstats import Statistics, Regression

# --- Statistics Example ---
stats = Statistics()
for _ in range(100):
    stats.push(random.random() * 100)

print(f"Statistics Count: {len(stats)}")
print(f"Statistics Mean: {stats.mean():.2f}")
print(f"Statistics Std Dev: {stats.stddev():.2f}")
print(f"Statistics Min: {stats.minimum():.2f}")
print(f"Statistics Max: {stats.maximum():.2f}")

# --- Regression Example ---
regr = Regression()

def linear_noisy_func(x_coord):
    alpha, beta = 1.5, 5.0
    noise = (2 * (random.random() - 0.5))
    return alpha * x_coord + beta + noise

for i in range(100):
    x_val = i * 0.1
    y_val = linear_noisy_func(x_val)
    regr.push(x_val, y_val)

print(f"\nRegression Count: {len(regr)}")
print(f"Regression Slope: {regr.slope():.2f}")
print(f"Regression Intercept: {regr.intercept():.2f}")
print(f"Regression Correlation: {regr.correlation():.2f}")
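The quickstart covers `Statistics` and `Regression`; a comparable sketch for `ExponentialStatistics` might look like the following. It assumes the constructor takes the decay rate as its first argument and that the instance exposes `push()`, `mean()`, and `stddev()`; check these against the installed version. The helper name `exponential_demo` is hypothetical, and the import is guarded so the snippet still runs when `runstats` is missing.

```python
try:
    from runstats import ExponentialStatistics
except ImportError:  # runstats is missing: see the Install section
    ExponentialStatistics = None


def exponential_demo(values, decay=0.9):
    """Return (mean, stddev) after pushing values, or None without runstats."""
    if ExponentialStatistics is None:
        return None
    # Assumption: the decay rate is the first constructor argument.
    exp_stats = ExponentialStatistics(decay)
    for value in values:
        exp_stats.push(value)
    # No len() here: older values are decayed away rather than counted.
    return exp_stats.mean(), exp_stats.stddev()


print(exponential_demo([1.0] * 50))
```

Because older values are progressively down-weighted, the result reflects recent inputs more than early ones, which is the usual reason to choose this class over `Statistics`.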