Awkward Array C++ Kernels
awkward-cpp provides the compiled C++ kernels and extension modules that power the Awkward Array library (version 2.x). It serves as the CPU performance backend, letting NumPy-like idioms on nested, variable-sized data structures run at compiled speed. It is not intended for direct end-user interaction; it is a core dependency of the main `awkward` package. It is currently at version 52, and its releases track `awkward` releases.
Warnings
- breaking As of `awkward` version 2.9.0 (and consequently `awkward-cpp` version 52 and later), Python 3.9 is no longer supported. Users must upgrade to Python 3.10 or newer.
- gotcha awkward-cpp primarily provides CPU kernels. For GPU acceleration, `awkward` relies on CuPy and its own CUDA kernels. To run operations on a GPU, users must install CuPy and explicitly move arrays with `ak.to_backend(array, "cuda")`, because `awkward-cpp` does not handle GPU computation itself.
- gotcha Awkward Arrays are designed to be immutable, meaning operations create new arrays rather than modifying existing ones in-place. However, if an Awkward Array is created from a mutable underlying data structure (like a NumPy array), modifying the original data structure in-place will also modify the Awkward Array referencing it. To prevent this, make an explicit deep copy using `ak.copy()`.
- gotcha While `awkward-cpp` is installable directly via pip, it is primarily a low-level dependency. Attempting to import or use it directly for high-level data manipulation will not work as it does not expose a public Python API for end-users. All high-level functionality is provided by the `awkward` library.
Install
- pip install awkward-cpp
- pip install awkward
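After installing, it can help to confirm which versions of the two distributions ended up in the environment. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Look up both distributions by their pip names.
versions = {name: installed_version(name) for name in ("awkward", "awkward-cpp")}

for name, found in versions.items():
    print(f"{name}: {found or 'not installed'}")
```

Because the lookup reads package metadata rather than importing anything, it works even when the compiled extension fails to load.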
Imports
- awkward-cpp
Functionality is accessed through the `awkward` library, for example `import awkward as ak`.
Quickstart
import awkward as ak
import numpy as np
# Create a nested, variable-sized Awkward Array
array = ak.Array([[1, 2, 3], [], [4, 5]])
print(f"Original Array: {array}, type: {array.type}")
# Perform a vectorized operation (e.g., multiply by 2)
result = array * 2
print(f"Result of array * 2: {result}, type: {result.type}")
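# Reduce each inner list with ak.sum (an Awkward reducer that mirrors np.sum)
sum_per_list = ak.sum(array, axis=1)
print(f"Sum per list: {sum_per_list}, type: {sum_per_list.type}")
# Apply a NumPy ufunc directly (works because of Awkward's NumPy-like behavior)
roots = np.sqrt(array)
print(f"Square roots: {roots}, type: {roots.type}")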