PyArrow
PyArrow is the Python library for Apache Arrow, which defines a standardized, language-independent columnar memory format for flat and hierarchical data. The current version is 23.0.1, released on March 28, 2026; releases follow a regular cadence.
Warnings
- breaking: PyArrow 23.0.1 changes the `pyarrow.fs` module, which affects file system operations. Review the release notes for migration details.
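- deprecated: The `pyarrow.parquet` module's `ParquetFile` class is listed as deprecated as of version 23.0.1, with `ParquetDataset` as the replacement. The two differ in scope (`ParquetFile` reads a single file, while `ParquetDataset` targets one or more files or a directory), so confirm the change in the release notes before migrating.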
Install
pip install pyarrow
Imports
- pa
import pyarrow as pa
Quickstart
import pyarrow as pa
import pyarrow.parquet as pq

# Create a simple Arrow Table from a dict of column name -> values
data = {'column1': [1, 2, 3], 'column2': ['A', 'B', 'C']}
table = pa.table(data)

# Write the Table to a Parquet file
pq.write_table(table, 'example.parquet')
# Read the Table back from the Parquet file
table_read = pq.read_table('example.parquet')
print(table_read)