fastparquet

2026.3.0 · deprecated · verified Mon Apr 06

fastparquet is a Python library providing performant read/write support for the Parquet file format, without needing a Python-Java bridge. It integrates well with Python-based big data workflows, particularly Dask and Pandas (versions < 3.0). As of March 2026, with Pandas 3.0 explicitly depending on PyArrow, `fastparquet` is being retired, and no further development is anticipated, though it remains usable for Pandas 2.x users.

Warnings

Install
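fastparquet is distributed on both PyPI and conda-forge; either standard command below works (given the deprecation notice above, you may want to pin a version):

```shell
pip install fastparquet
# or, via conda-forge:
conda install -c conda-forge fastparquet
```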

Imports

Quickstart

This quickstart demonstrates how to create a Pandas DataFrame, write it to a Parquet file using `fastparquet.write`, and then read the data back into a new DataFrame using `fastparquet.ParquetFile.to_pandas()`. It also includes basic file cleanup.

import pandas as pd
from fastparquet import write, ParquetFile
import os

# Create a sample DataFrame
df = pd.DataFrame({
    'col1': [1, 2, 3, 4],
    'col2': ['A', 'B', 'C', 'D'],
    'col3': [True, False, True, False]
})

filename = "example.parquet"

# Write the DataFrame to a Parquet file with Snappy compression
write(filename, df, compression='SNAPPY')
print(f"DataFrame successfully written to '{filename}'.")

# Read the Parquet file back into a DataFrame
pf = ParquetFile(filename)
df_read = pf.to_pandas()
print(f"DataFrame successfully read from '{filename}':")
print(df_read)

# Clean up the created file
os.remove(filename)
