Awkward Array Pandas Extension

2023.8.0 verified Fri May 01 auth: no python

Awkward-pandas integrates Awkward Arrays with pandas, so DataFrames can hold columnar data with variable-length lists, nested structures, and missing values. Version 2023.8.0 supports Python >=3.8. It provides a pandas extension array for storage and an `.ak` accessor for working with the underlying Awkward data. The project is in low maintenance, and releases are sporadic.

pip install awkward-pandas
error TypeError: Cannot interpret 'list' as an Awkward Array
cause A plain Python list was passed to the AwkwardExtensionArray constructor instead of an ak.Array.
fix
Wrap in ak.Array: AwkwardExtensionArray(ak.Array(your_list))
error ModuleNotFoundError: No module named 'awkward_pandas'
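cause The package is not installed, or it is installed in a different environment from the one running your code. Note that the distribution name uses a hyphen (awkward-pandas) while the import name uses an underscore (awkward_pandas).
fix
Run 'pip install awkward-pandas' in the Python environment that runs your code.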
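If the import still fails after installing, it can help to confirm which interpreter is running and whether it can locate the package. The following is a minimal sketch using only the standard library; the helper name `is_importable` is illustrative and not part of awkward-pandas.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# The interpreter that is actually running this code; pip must install
# into this same environment for the import to succeed.
print("interpreter:", sys.executable)
print("awkward_pandas importable:", is_importable("awkward_pandas"))
```

If the second line prints False, installing with that same interpreter (for example via `python -m pip install awkward-pandas`) usually resolves the mismatch.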
error ValueError: The 'data' argument must be an Awkward Array
cause The constructor was called with an incompatible type (e.g., a NumPy array).
fix
Convert to ak.Array first: AwkwardExtensionArray(ak.Array(np_array))
breaking Version 2023.8.0 may be incompatible with older versions of awkward (<2.0). Awkward 2.0 introduced breaking changes in its API (e.g., `ak.Array` construction). Ensure awkward >=2.0 is installed.
fix Upgrade awkward to >=2.0.0: pip install 'awkward>=2.0.0'
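One way to catch this mismatch early is to check the installed awkward version before importing awkward-pandas. This is a sketch under the assumption that a simple major-version comparison is enough; the helper names `major_version` and `awkward_is_v2_or_newer` are hypothetical and use only the standard library.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string: str) -> int:
    """Extract the leading integer from a version string such as '2.4.1'."""
    return int(version_string.split(".")[0])


def awkward_is_v2_or_newer() -> bool:
    """Return False when awkward is missing or older than 2.0."""
    try:
        installed = version("awkward")
    except PackageNotFoundError:
        # awkward is not installed in this environment at all.
        return False
    return major_version(installed) >= 2


print("awkward >= 2.0 installed:", awkward_is_v2_or_newer())
```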
gotcha The `AwkwardExtensionArray` constructor expects an Awkward Array, not a list or numpy array. Passing a plain Python list will raise a TypeError.
fix Ensure the argument is an `ak.Array` object: `AwkwardExtensionArray(ak.Array([1, 2, 3]))`
deprecated The `.ak` accessor methods are not fully aligned with the pandas 2.0+ API. Some operations (e.g., `.ak.num()`) may return unexpected types in newer pandas versions. Prefer explicit conversion to NumPy or Awkward before aggregations.
fix For critical code, convert the column to an Awkward Array and use native Awkward operations instead of the accessor.

Create a DataFrame with an Awkward Array column and use the .ak accessor.

import awkward as ak
import pandas as pd
from awkward_pandas import AwkwardExtensionArray

# Create an Awkward Array
array = ak.Array([[1, 2], [3], None])

# Create a DataFrame with AwkwardExtensionArray column
ser = pd.Series(AwkwardExtensionArray(array), name='col')
df = ser.to_frame()
print(df)

# Use the accessor for operations
print(df['col'].ak.num())