Pangres

raw JSON →
4.2.1 verified Mon Apr 27 auth: no python

Pangres is a Python library that provides efficient upsert (insert/update) functionality for pandas DataFrames into SQL databases (PostgreSQL, MySQL, SQLite) via SQLAlchemy. It supports both synchronous and asynchronous operations, with automatic table creation and schema inference. The current version is 4.2.1, with monthly or irregular releases.

pip install pangres
error AttributeError: module 'pangres' has no attribute 'upsert'
cause After 'import pangres', the upsert function is not directly accessible as a module attribute in older versions or due to missing exports.
fix
Use 'from pangres import upsert' or upgrade to v4.2.1 where exports are fixed.
error sqlalchemy.exc.InvalidRequestError: A value is required for a bound parameter in the WHERE clause
cause Pangres requires the DataFrame index to be named (as a column in the database). An unnamed index or MultiIndex with duplicate level names causes this.
fix
Ensure the DataFrame index has a name (e.g., df.index.name = 'id'). For MultiIndex, ensure all level names are unique and match the database primary key columns.
breaking In v4.0, the first argument 'engine' was renamed to 'con', and now accepts both engines and connections. Old code using 'engine=' keyword will break.
fix Use 'con=' instead of 'engine='.
breaking In v4.0, the 'chunksize' parameter default changed from 10000 to None (insert all rows at once). If you relied on the old default, your upserts may now attempt huge transactions.
fix Explicitly set chunksize=10000 to restore old behavior.
gotcha When using async (aupsert), the input 'con' must be an async SQLAlchemy engine or connection, not a sync one. Mixing sync/async will raise an error.
fix Ensure you create an async engine with create_async_engine from sqlalchemy.ext.asyncio.
deprecated Support for Python 3.12 was added in v4.2.0 with a new dependency 'packaging'. Users on Python 3.12 must upgrade to v4.2.0+.
fix Upgrade pangres to >=4.2.0.

Inserts or updates rows in PostgreSQL using a DataFrame with a primary key index. The engine can be replaced with a connection for transaction control.

import pandas as pd
from sqlalchemy import create_engine
from pangres import upsert

engine = create_engine('postgresql://user:pass@localhost/db')
df = pd.DataFrame({'id': [1, 2], 'value': ['a', 'b']}).set_index('id')
upsert(con=engine, df=df, table_name='my_table', if_row_exists='update')