Apache DataFusion Python
A Python library that provides bindings to Apache DataFusion, an in-memory query engine built on Apache Arrow. It enables users to build and execute high-performance queries using SQL or a DataFrame API against various data sources, including CSV, Parquet, JSON, and in-memory data. Because the engine is written in Rust and operates on Arrow's columnar format, data can be exchanged with PyArrow efficiently and without copies. The library is actively maintained, with a current version of 52.3.0, and typically releases in sync with the core DataFusion project.
Warnings
- breaking Breaking changes to Foreign Function Interface (FFI) for Python extensions (e.g., custom CatalogProvider, TableProvider). Users implementing custom FFI-based providers must now provide `LogicalExtensionCodec` and `TaskContextProvider`, and method signatures have changed.
- gotcha DataFusion's Python bindings are tightly coupled with the core Rust DataFusion library. Downstream libraries (e.g., `deltalake`, `pyiceberg`) that provide DataFusion table providers often require exact version matches. This can lead to dependency conflicts when using multiple such libraries.
- breaking The way schemas are passed to `FileSource` constructors and `FileScanConfigBuilder` has been refactored. File sources now require the schema (including partition columns) at construction, and `FileScanConfigBuilder` no longer accepts a separate schema parameter. Additionally, `FilePruner::try_new()` signature changed.
- deprecated The `SchemaAdapterFactory` has been fully removed from Parquet scanning. This includes the `SchemaAdapter`, `SchemaMapper`, and `DefaultSchemaAdapterFactory` traits/structs.
- gotcha The default value of the `datafusion.execution.collect_statistics` configuration setting changed from `false` to `true`. This means DataFusion will now collect and store statistics by default when a table is first created via `CREATE EXTERNAL TABLE` or DataFrame `register_*` APIs.
- breaking For advanced User-Defined Functions (UDFs), UDF traits now work with `FieldRef` rather than a `DataType` plus nullability directly. `FieldRef` also carries field metadata, which enables extension types.
Install
- pip
pip install datafusion
Imports
- SessionContext
from datafusion import SessionContext
- col
from datafusion import col
- udf
from datafusion import udf
- functions
from datafusion import functions
Quickstart
from datafusion import SessionContext, col
import pyarrow as pa
# Create a DataFusion session context
ctx = SessionContext()
# Create an in-memory PyArrow table
data = {
"id": [1, 2, 3, 4],
"value": [10, 20, 15, 25],
"category": ["A", "B", "A", "C"]
}
pyarrow_table = pa.table(data)
# Register the PyArrow table as a DataFusion table
ctx.register_record_batches("my_table", [pyarrow_table.to_batches()])
# Execute a SQL query
df_sql = ctx.sql("SELECT category, SUM(value) FROM my_table GROUP BY category ORDER BY category")
print("SQL Query Result:")
print(df_sql.to_pandas())
# Execute a DataFrame API query
df_dataframe = ctx.table("my_table")
from datafusion import functions as f  # aggregate functions live in datafusion.functions
df_dataframe = df_dataframe.aggregate(
    [col("category")],
    [f.sum(col("value")).alias("total_value")],
).sort(col("category"))
print("\nDataFrame API Query Result:")
print(df_dataframe.to_pandas())