Awkward Array
Awkward Array is a Python library for manipulating nested, variable-sized data (like JSON) with NumPy-like idioms. Its arrays are dynamically typed, but operations on them are compiled and fast, and they generalize NumPy's behavior to irregular data structures. The library is actively developed with frequent releases; the current version is 2.9.0.
Warnings
- breaking Awkward Array v2.9.0 and later drop support for Python 3.9. Users on Python 3.9 must upgrade to Python 3.10 or newer.
- breaking Version 2.x represents a major rewrite from Awkward Array 1.x. Users migrating from 1.x will encounter significant API changes and should consult the official migration guide.
- deprecated The JAX backend for Awkward Array is deprecated and will be removed in a future release. Users relying on JAX integration should plan to migrate their code.
- gotcha If pre-compiled binary wheels for `awkward-cpp` (a core dependency) are not available for your specific platform and Python version, `pip` will attempt to compile it from source. This requires a C++ compiler and associated development tools to be installed on your system.
- gotcha The Awkward Array JAX backend does not support JIT compilation (via `jax.jit`) of reducers (e.g., `ak.sum`). This is due to a limitation of JAX's XLA model, which requires array sizes to be known at compile time rather than dependent on the data.
- gotcha While Awkward Array includes fixes for compatibility with specific NumPy 2.x versions (e.g., 2.3), general breaking changes introduced in NumPy 2.0 (such as data type promotion rules and changes to the `copy` keyword behavior) can still affect user code that directly interacts with NumPy or expects specific type behaviors. Users should review the NumPy 2.0 migration guide.
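The version-related warnings above can be checked before installing or upgrading. The following is a minimal sketch using only the standard library; the helper names `python_is_supported` and `installed_version` are hypothetical and not part of Awkward Array, and the 3.10 minimum is taken from the warning above.

```python
import sys
from importlib import metadata

# Minimum Python version stated in the warning above for awkward >= 2.9.0.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum (hypothetical helper)."""
    return tuple(version_info[:2]) >= tuple(minimum)


def installed_version(package="awkward"):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("awkward version:", installed_version("awkward"))
    print("awkward-cpp version:", installed_version("awkward-cpp"))
    print("numpy version:", installed_version("numpy"))
```

Running this in the target environment shows whether the interpreter is new enough and which versions of `awkward`, `awkward-cpp`, and `numpy` are present, which helps when diagnosing a source build or a NumPy 2.x behavior change.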
Install
- pip install awkward
- conda install -c conda-forge awkward
Imports
- ak
import awkward as ak
Quickstart
import awkward as ak
import numpy as np
array = ak.Array([
    [{"x": 1.1, "y": [1]}, {"x": 2.2, "y": [1, 2]}, {"x": 3.3, "y": [1, 2, 3]}],
    [],
    [{"x": 4.4, "y": [1, 2, 3, 4]}, {"x": 5.5, "y": [1, 2, 3, 4, 5]}],
])
# Slice out the y values, drop the first element from each inner list, and square them
output = np.square(array["y", ..., 1:])
print(output)