DuckDB in-process database
Version 1.5.1, verified Tue May 12, auth: no, python, install: draft
DuckDB is an in-process SQL OLAP (Online Analytical Processing) database management system designed for fast analytical queries directly within your Python application. It runs without a separate server and integrates with Python data libraries such as Pandas and Polars. It is actively maintained with frequent releases (currently version 1.5.1) and targets analytical workloads on large datasets.
pip install duckdb
Common errors
error ImportError: DLL load failed while importing duckdb: The specified module could not be found. ↓
cause This error typically occurs on Windows when the necessary Microsoft Visual C++ Redistributable package is missing or outdated, which DuckDB's pre-compiled binaries (wheels) depend on.
fix Install the latest Microsoft Visual C++ Redistributable package from Microsoft's website. Alternatively, force pip to compile DuckDB from source using `python3 -m pip install duckdb --no-binary duckdb` (this requires a C++ compiler).
error FATAL Error: Failed: database has been invalidated because of a previous fatal error. The database must be restarted prior to being used again. ↓
cause DuckDB enters a 'restricted mode' after encountering an internal error or crash, leaving the database in an undefined state and preventing further operations until restarted.
fix Close the current DuckDB connection or session and then start a new connection to the database. If working with a persistent database file, DuckDB will attempt to replay the write-ahead log upon reconnection to restore the database to its state before the crash.
error AttributeError: 'NoneType' object has no attribute 'execute' ↓
cause This error occurs when the variable that is supposed to hold the connection is `None`, for example because a helper function never returned the connection, or because an exception from `duckdb.connect()` was caught and the variable was left as `None`. `duckdb.connect()` itself raises an exception on failure rather than returning `None`. Subsequent calls such as `execute()` or `close()` on that `None` object then fail.
fix Ensure that the `duckdb.connect()` call succeeds and that its return value is actually assigned and returned wherever you store it. Check any parameters passed to `connect()` (e.g., database file paths) for mistakes, and do not silently swallow connection exceptions; handle them before the connection is used.
error IO Error: Cannot open file "...": The process cannot access the file because it is being used by another process. ↓
cause This error happens when multiple processes, applications, or even multiple connections within the same application attempt to open and write to the same DuckDB database file concurrently, leading to a file lock.
fix Ensure that only one process or connection is accessing the DuckDB database file at a time. If using multiple connections, ensure they are properly closed after use, or consider using in-memory databases (`:memory:`) or distinct file paths if concurrent access is truly needed (though DuckDB is designed for single-process, multi-threaded use with shared connections).
error BinderException: Binder Error: Referenced column "..." not found in FROM clause! ↓
cause This typically occurs in the Python API when referencing a column that resulted from an aggregate function (e.g., 'sum(pnl)') within a chained relational API call, as DuckDB's binder might incorrectly interpret the string as an expression rather than a column name.
fix Explicitly alias aggregated columns using the `.alias()` method immediately after the aggregation in the relational API, or use direct SQL queries with aliases to make the column names unambiguous for subsequent operations.
Warnings
breaking Python 3.9 support has been dropped with DuckDB Python v1.5.0. Users on Python 3.9 will encounter errors. ↓
fix Upgrade your Python environment to version 3.10 or newer. DuckDB v1.5.0 requires Python >=3.10.0.
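A small guard like the one below can make the version requirement explicit before installing or importing DuckDB. It is a sketch using only the standard library; the helper name `python_supported` and the constant `MIN_PYTHON` are illustrative, and the minimum version is taken from the note above.

```python
import sys

# Minimum Python version stated above for DuckDB Python v1.5.0.
MIN_PYTHON = (3, 10)


def python_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if not python_supported():
    found = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Python {found} is too old for DuckDB 1.5.x; upgrade to 3.10 or newer.")
```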
breaking The `duckdb.typing` and `duckdb.functional` modules were removed in v1.5.0, having been deprecated in v1.4.0. ↓
fix Replace imports and usage of `duckdb.typing` with `duckdb.sqltypes`, and `duckdb.functional` with `duckdb.func`.
deprecated The methods `fetch_arrow_table()` and `fetch_record_batch()` on connections and relations have been deprecated. ↓
fix Use the new `to_arrow_table()` and `to_arrow_reader()` methods instead for Arrow export APIs.
gotcha DuckDB's persistent storage format is not stable across major/minor versions prior to v1.0. Upgrading DuckDB can lead to `IOException` when trying to read older database files. ↓
fix If you encounter this, open the old database file with the DuckDB version that created it, run `EXPORT DATABASE` to a new location, then run `IMPORT DATABASE` with the newer DuckDB version. Starting with DuckDB v0.10, the storage format is backward-compatible, so newer versions can read files written by v0.10 or later.
gotcha The `column` parameter in relational API functions (e.g., `min`, `max`, `sum`) was renamed to `expression` to better reflect that it accepts expressions, not just column names. ↓
fix Update calls to these relational API functions to use `expression` instead of `column`.
deprecated The lambda arrow syntax `x -> x + 1` in SQL queries is deprecated in v1.5.0 and will emit a warning. ↓
fix Transition to the new Python-style lambda syntax: `lambda x: x + 1`. You can configure `lambda_syntax` to change behavior.
gotcha Building `duckdb` from source requires a C++ compiler (like `g++`) and potentially other build tools (e.g., `cmake`). Minimal environments like Alpine Linux often lack these by default, leading to build failures. ↓
fix Install necessary build tools like `g++` and `cmake` in your environment before attempting to install `duckdb`. For Alpine Linux, this typically involves `apk add build-base cmake`.
Install compatibility draft last tested: 2026-05-12
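| python | os / libc | status | wheel | install | import | disk |
|---|---|---|---|---|---|---|
| 3.10 | alpine (musl) | build_error | - | - | - | - |
| 3.10 | alpine (musl) | | - | - | - | - |
| 3.10 | slim (glibc) | | wheel | 2.2s | 0.10s | 77M |
| 3.10 | slim (glibc) | | - | - | 0.10s | 77M |
| 3.11 | alpine (musl) | build_error | - | - | - | - |
| 3.11 | alpine (musl) | | - | - | - | - |
| 3.11 | slim (glibc) | | wheel | 2.2s | 0.14s | 79M |
| 3.11 | slim (glibc) | | - | - | 0.15s | 79M |
| 3.12 | alpine (musl) | build_error | - | - | - | - |
| 3.12 | alpine (musl) | | - | - | - | - |
| 3.12 | slim (glibc) | | wheel | 2.0s | 0.17s | 71M |
| 3.12 | slim (glibc) | | - | - | 0.17s | 71M |
| 3.13 | alpine (musl) | build_error | - | - | - | - |
| 3.13 | alpine (musl) | | - | - | - | - |
| 3.13 | slim (glibc) | | wheel | 2.0s | 0.16s | 70M |
| 3.13 | slim (glibc) | | - | - | 0.15s | 70M |
| 3.9 | alpine (musl) | build_error | - | - | - | - |
| 3.9 | alpine (musl) | | - | - | - | - |
| 3.9 | slim (glibc) | | wheel | 2.6s | 0.10s | 74M |
| 3.9 | slim (glibc) | | - | - | 0.10s | 74M |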
Imports
- duckdb
  wrong: `from duckdb import connect`
  correct: `import duckdb`
- duckdb.sqltypes
  wrong: `from duckdb import typing`
  correct: `from duckdb import sqltypes`
- duckdb.func
  wrong: `from duckdb import functional`
  correct: `from duckdb import func`
Quickstart last tested: 2026-04-24
import duckdb
# Connect to an in-memory database (data is lost after session)
con = duckdb.connect(database=':memory:')
# Execute a SQL query and print the result (show() prints and returns None)
con.sql("SELECT 42 AS answer").show()
# Create a table and insert data
con.execute("CREATE TABLE my_table (id INTEGER, name VARCHAR)")
con.execute("INSERT INTO my_table VALUES (1, 'Alice'), (2, 'Bob')")
# Query the table and fetch results as a Pandas DataFrame (requires pandas to be installed)
df_result = con.sql("SELECT * FROM my_table WHERE id = 1").df()
print(df_result)
# Example of using the default global in-memory database
df_global = duckdb.sql("SELECT 'Hello, DuckDB!' AS message").df()
print(df_global)