DuckDB-Engine
duckdb-engine is an SQLAlchemy driver for DuckDB, a high-performance analytical in-process SQL database system. It enables Python applications to interact with DuckDB databases using SQLAlchemy's ORM and SQL Expression Language. Currently at `v0.17.0`, it receives frequent updates addressing bug fixes and introducing features like filesystem registration.
Warnings
- breaking Python 3.8 support was dropped in `v0.16.0`. Attempting to use `duckdb-engine` with Python 3.8 will fail.
- gotcha Prior to `v0.15.1`, there were potential panics in multi-threaded environments. DuckDB generally follows a single-writer, multiple-reader concurrency model.
- gotcha SQLAlchemy's `SERIAL` datatype, typically used for auto-incrementing primary keys in PostgreSQL, is not directly supported by DuckDB, which can lead to issues with ORM-generated schemas.
- gotcha The `duckdb-engine` dialect is derived from PostgreSQL, and as such, SQLAlchemy may attempt to use PostgreSQL-only features not supported by DuckDB's SQL parser.
- gotcha Division operations might produce unexpected results due to default casting behavior (`div_is_floordiv=True` by default) prior to `v0.14.1`.
- gotcha Connecting to MotherDuck databases requires specific connection string parameters, including an optional `motherduck_token` for authentication.
Install
-
pip install duckdb-engine -
conda install duckdb-engine -c conda-forge
Imports
- create_engine
from sqlalchemy import create_engine
Quickstart
from sqlalchemy import create_engine, text
# Connect to an in-memory DuckDB database
engine = create_engine("duckdb:///:memory:")
with engine.connect() as connection:
# Execute a simple DDL statement
connection.execute(text("CREATE TABLE users (id INTEGER, name VARCHAR)"))
connection.commit()
# Execute an INSERT statement
connection.execute(text("INSERT INTO users (id, name) VALUES (:id, :name)"), {"id": 1, "name": "Alice"})
connection.execute(text("INSERT INTO users (id, name) VALUES (:id, :name)"), {"id": 2, "name": "Bob"})
connection.commit()
# Execute a SELECT statement and fetch results
result = connection.execute(text("SELECT id, name FROM users"))
for row in result:
print(f"ID: {row.id}, Name: {row.name}")