Vortex Data
Vortex Data (`vortex-data`) provides Python bindings for Vortex, an Apache Arrow-compatible toolkit for working with compressed array data. It targets efficient storage, processing, and retrieval of large datasets in high-performance analytics and database workloads. The library is under active development with frequent releases, often monthly; the current version is 0.68.0.
Common errors
-
AttributeError: module 'vortex.array' has no attribute 'Array'
cause Importing or referencing the `Array` class directly after it was renamed to `DynArray` (version 0.62.0) or its usage pattern shifted.
fix Use `vortex.array.array()` or `vortex.array.scalar()` to create array instances. For a direct class reference, use `vortex.array.DynArray`.
-
AttributeError: 'vortex.array.DynArray' object has no attribute 'serialize'
cause Calling the old `serialize` method directly on a Vortex array instance. The method was removed and moved to a plugin interface in v0.68.0.
fix The serialization API has changed. Refer to the Vortex documentation for the correct way to serialize arrays using `ArrayPlugin`.
-
NameError: name 'old_compute_function' is not defined (or a similar AttributeError on vortex.compute)
cause Using a compute function that was removed, renamed, or significantly refactored in version 0.61.0 or later.
fix Consult the latest Vortex documentation for the updated compute API. Functions may have changed names or signatures, or been removed.
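Code that has to run against releases on both sides of the rename can look the class up by name at runtime instead of importing it directly. The following is a minimal sketch, assuming only what the entries above state (the class is exposed as `DynArray` on newer releases and as `Array` on older ones). `resolve_attr` is a hypothetical helper, not part of Vortex, and a `SimpleNamespace` stands in for `vortex.array` so the snippet runs without the package installed.

```python
from types import SimpleNamespace


def resolve_attr(module, *candidates):
    """Return the first attribute in `candidates` that exists on `module`."""
    for name in candidates:
        try:
            return getattr(module, name)
        except AttributeError:
            continue
    raise AttributeError(
        f"{getattr(module, '__name__', module)!r} has none of: {', '.join(candidates)}"
    )


# Stand-ins for an older and a newer `vortex.array` module (assumed layouts,
# used here only so the example is runnable without vortex-data installed).
old_module = SimpleNamespace(__name__="vortex.array", Array=type("Array", (), {}))
new_module = SimpleNamespace(__name__="vortex.array", DynArray=type("DynArray", (), {}))

# Prefer the new name and fall back to the old one.
old_cls = resolve_attr(old_module, "DynArray", "Array")
new_cls = resolve_attr(new_module, "DynArray", "Array")
```

With the real package, the equivalent call would presumably be `resolve_attr(vortex.array, "DynArray", "Array")`, provided the class names match what is described above.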
Warnings
- breaking The core `Array` class was renamed to `DynArray` in version 0.62.0. Direct imports or references to `vortex.array.Array` will result in an `AttributeError`.
- breaking The compute functions API underwent significant changes in version 0.61.0, with many public functions being removed, renamed, or refactored. Code relying on previous compute APIs will likely break.
- breaking The serialization API for Vortex arrays changed in version 0.68.0. The `serialize` method was moved from the `Array` object to `ArrayPlugin::Serialize`.
- breaking The `vortex-scan` API was refactored in version 0.66.0 to be exclusively focused on the Scan API, potentially affecting modules or functions related to scanning outside this scope.
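Before upgrading, it can help to see which of these breaking changes fall between the version a project was written against and the target version. The sketch below uses only the Python standard library. The version numbers and notes are copied from the warnings above; `parse_version`, `changes_crossed`, and `installed_version` are hypothetical helper names, and the parser only handles simple `X.Y.Z` strings.

```python
from importlib import metadata

# Breaking-change versions as listed in the warnings above.
BREAKING_CHANGES = {
    (0, 61, 0): "compute functions API removed, renamed, or refactored",
    (0, 62, 0): "`Array` renamed to `DynArray`",
    (0, 66, 0): "`vortex-scan` refactored around the Scan API",
    (0, 68, 0): "`serialize` moved to `ArrayPlugin`",
}


def parse_version(text):
    """Parse the leading 'X.Y.Z' of a version string into a tuple of ints."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def changes_crossed(old, new):
    """List breaking changes introduced after `old`, up to and including `new`."""
    lo, hi = parse_version(old), parse_version(new)
    return [note for ver, note in sorted(BREAKING_CHANGES.items()) if lo < ver <= hi]


def installed_version(dist="vortex-data"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None
```

For example, `changes_crossed("0.60.0", "0.68.0")` returns all four notes, while `changes_crossed("0.66.0", "0.68.0")` returns only the serialization change.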
Install
-
pip install vortex-data
Imports
- vortex
import vortex
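- array
import vortex.array
(not `from vortex.array import Array`; per the warnings above, `Array` was renamed to `DynArray` in version 0.62.0, so that import fails)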
Quickstart
import vortex
import numpy as np
# Create a Vortex scalar array with an explicit dtype
scalar_arr = vortex.array.scalar(10, dtype=vortex.DType.Int32)
print(f"Scalar Array: {scalar_arr}\n")
# Create a Vortex array from a NumPy array
numpy_arr = np.arange(10, dtype=np.int32)
vortex_arr = vortex.array.array(numpy_arr)
print(f"Array from NumPy: {vortex_arr}\n")
# Get its size and convert to Apache Arrow
print(f"Size in bytes: {vortex_arr.nbytes}\n")
arrow_array = vortex_arr.to_arrow()
print(f"Converted to Arrow: {arrow_array}")