PyMongoArrow

1.13.0 verified Mon Apr 27 auth: no python

PyMongoArrow bridges MongoDB with NumPy, pandas, Polars, and PyArrow. Version 1.13.0 adds parallel batch processing and support for Polars ExtensionTypes. New releases ship roughly monthly.

pip install pymongoarrow
error ImportError: cannot import name 'patch_all' from 'pymongoarrow'
cause patch_all is not exported from the package root; it lives in the pymongoarrow.monkey submodule.
fix
Use: from pymongoarrow.monkey import patch_all
error TypeError: 'Schema' object is not callable
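cause A Schema instance was called like a function, or Schema was constructed with the wrong arguments. Schema is a class whose constructor takes a single dict mapping field names to types.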
fix
Use: Schema({'field': type}) instead of Schema('field', type)
error pymongo.errors.OperationFailure: unknown command: aggregation
cause MongoDB server version too old (<3.6) or missing aggregation support.
fix
Upgrade MongoDB to >=3.6 or use a different query method.
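As a rough illustration of the version requirement above, the following sketch compares a server version string against the 3.6 minimum. The helper name server_meets_minimum is hypothetical and not part of PyMongoArrow or PyMongo; it assumes the usual "major.minor.patch" version format.

```python
# Minimum server version taken from the fix above.
MIN_SERVER = (3, 6)


def server_meets_minimum(version, minimum=MIN_SERVER):
    """Return True if a 'major.minor.patch' version string is >= minimum.

    Hypothetical helper for illustration; only major and minor are compared.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= minimum
```

With PyMongo, the version string can be read from client.server_info()["version"] and passed to this helper before running arrow queries.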
breaking In v1.11.0, support for Python 3.9 was dropped, along with support for free-threaded Python 3.13 on some platforms. The TypeRegistry in write() now extends the existing registry rather than replacing it.
fix Upgrade to Python >=3.10. When writing Arrow data, do not rely on the old behavior in which the TypeRegistry was replaced.
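A project that pins PyMongoArrow 1.11.0 or later could guard against unsupported interpreters with a check along these lines. This is a minimal sketch; python_is_supported and MIN_PYTHON are illustrative names, not part of the library.

```python
import sys

# Minimum interpreter version per the breaking-change note above.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=None):
    """Return True if the given (or running) interpreter is >= MIN_PYTHON."""
    current = tuple((version_info or sys.version_info)[:2])
    return current >= MIN_PYTHON
```

Calling python_is_supported() with no argument checks the running interpreter; passing a tuple such as (3, 9) checks a specific version.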
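deprecated PyMongoArrow 1.13.0 still relies on pymongoarrow.monkey.patch_all() to add the find_*_all methods to PyMongo collections; this may change in future versions.
fix Monitor the release notes for deprecation notices; consider calling the functions in pymongoarrow.api directly, e.g. find_arrow_all(collection, query, schema=schema).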
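The direct-API route suggested above might look like the sketch below, which queries without calling patch_all(). The function name fetch_people_as_arrow is hypothetical, and the database, collection and field names are borrowed from the usage example on this page. The third-party imports sit inside the function only so the sketch can be loaded where the driver is not installed.

```python
import os


def fetch_people_as_arrow(uri=None):
    """Return documents from test.mydata as a pyarrow.Table, without patch_all()."""
    # Lazy imports: pymongo and pymongoarrow are only needed when this runs.
    import pymongo
    from pymongoarrow.api import Schema, find_arrow_all

    uri = uri or os.environ.get("MONGODB_URI", "mongodb://localhost:27017")
    client = pymongo.MongoClient(uri)
    try:
        coll = client.test.mydata
        schema = Schema({"name": str, "age": int, "city": str})
        # Module-level function: the collection is passed as the first argument.
        return find_arrow_all(coll, {}, schema=schema)
    finally:
        client.close()
```

Calling fetch_people_as_arrow() requires a reachable MongoDB server; nothing is patched onto pymongo's Collection class.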
gotcha Schema inference may overflow for large integers (e.g., int64) if not explicitly defined. In v1.13.0, schema inference raises OverflowError for out-of-range values.
fix Always define Schema explicitly for fields that may exceed inferred type range.
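To decide which fields need an explicit 64-bit type, a range check over sample values can help. The sketch below is plain Python and does not reproduce PyMongoArrow's inference logic; required_int_bits is a hypothetical helper, and the idea that small sample values might lead to a narrower inferred type is an assumption made for illustration.

```python
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def required_int_bits(values):
    """Return 32 or 64, the narrowest signed width that holds every value.

    Raises OverflowError if a value does not fit in a signed 64-bit integer.
    """
    bits = 32
    for value in values:
        if not INT64_MIN <= value <= INT64_MAX:
            raise OverflowError(f"{value} does not fit in a signed 64-bit integer")
        if not INT32_MIN <= value <= INT32_MAX:
            bits = 64
    return bits
```

If the check reports 64 for a field, declaring it explicitly, for example Schema({'counter': pyarrow.int64()}) with a hypothetical field named counter, avoids relying on inference.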
gotcha pandas is an optional dependency; if not installed, find_pandas_all() will raise ImportError.
fix Install pymongoarrow[pandas] or pandas separately.
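One way to avoid the ImportError at call time is to check for the optional packages up front. The following sketch uses only the standard library; is_installed and available_outputs are illustrative names, not PyMongoArrow functions.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def available_outputs():
    """List the find_*_all variants usable in this environment.

    find_arrow_all is always listed; the pandas and Polars variants are
    listed only when the corresponding optional package is installed.
    """
    optional = {"pandas": "find_pandas_all", "polars": "find_polars_all"}
    outputs = ["find_arrow_all"]
    outputs.extend(func for module, func in optional.items() if is_installed(module))
    return outputs
```

A caller can then fall back to find_arrow_all when "find_pandas_all" is not in the returned list.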
pip install "pymongoarrow[polars]"
pip install "pymongoarrow[pandas]"
pip install "pymongoarrow[all]"

Basic usage: patch pymongo, define schema, query collections to get Arrow/Pandas/Polars outputs.

import os
import pymongo
from pymongoarrow.api import Schema
from pymongoarrow.monkey import patch_all

# Patch pymongo to enable arrow operations
patch_all()

client = pymongo.MongoClient(os.environ.get('MONGODB_URI', 'mongodb://localhost:27017'))
db = client.test
coll = db.mydata

# Define schema (field name: type)
schema = Schema({'name': str, 'age': int, 'city': str})

# Insert sample data
coll.insert_many([
    {'name': 'Alice', 'age': 30, 'city': 'NYC'},
    {'name': 'Bob', 'age': 25, 'city': 'SF'},
])

# Fetch as a PyArrow table
table = coll.find_arrow_all({}, schema=schema)
print(table)

# Fetch as a pandas DataFrame (requires pandas)
df = coll.find_pandas_all({}, schema=schema)
print(df)

# Fetch as a Polars DataFrame (requires polars)
pl_df = coll.find_polars_all({}, schema=schema)
print(pl_df)