Numbagg: Fast N-dimensional Aggregation

0.9.4 · active · verified Tue Apr 14

Numbagg provides fast N-dimensional aggregation functions accelerated by Numba's just-in-time (JIT) compiler and NumPy's generalized universal function (gufunc) machinery. It aims to outperform pandas, bottleneck, and NumPy on certain operations, particularly when those operations run in parallel. The current version is 0.9.4, and the project is actively developed with regular releases.

Warnings

Install

Imports

Quickstart

This quickstart shows how to use `numbagg` on NumPy arrays for basic aggregation (`nansum`) and moving-window calculations (`move_mean`). Note that the first call to any numbagg function incurs JIT compilation overhead.

import numbagg
import numpy as np

a = np.array([1, 2, np.nan, 4, 5])
b = np.random.rand(10, 5)

# Calculate sum, ignoring NaNs
sum_result = numbagg.nansum(a)
print(f"nansum(a): {sum_result}")

# Calculate moving mean with a window of 3
moving_mean_result = numbagg.move_mean(b, window=3, axis=1)
print(f"move_mean(b, window=3, axis=1).shape: {moving_mean_result.shape}")
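
To make the quickstart results easier to sanity-check, here is a minimal pure-NumPy sketch of what the two calls are expected to compute. The helper `reference_move_mean` is hypothetical and not part of numbagg. It assumes a trailing window and that positions with fewer than `window` observations are NaN, which is my understanding of the default behavior; check the numbagg documentation before relying on it. NaN handling inside the window is deliberately left out.

```python
import numpy as np


def reference_move_mean(values, window):
    """Hypothetical reference: trailing moving mean over a 1-D array.

    Assumption: the first `window - 1` positions are NaN because the
    window is not yet full. Inputs are assumed to contain no NaNs.
    """
    values = np.asarray(values, dtype=float)
    out = np.full(values.shape, np.nan)
    for i in range(window - 1, values.size):
        # Average the `window` values ending at position i.
        out[i] = values[i - window + 1 : i + 1].mean()
    return out


# Same input as the quickstart: the NaN is skipped, so 1 + 2 + 4 + 5 = 12.
a = np.array([1, 2, np.nan, 4, 5])
expected_sum = np.nansum(a)
print(f"np.nansum(a): {expected_sum}")  # 12.0

# A NaN-free row: the first two entries are NaN, then the means 2.0, 3.0, 4.0.
row = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
expected_mean = reference_move_mean(row, window=3)
print(expected_mean)
```

Comparing these values against `numbagg.nansum(a)` and `numbagg.move_mean(row, window=3)` with `np.testing.assert_allclose(..., equal_nan=True)` is one way to confirm the assumed semantics on your installed version.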
