DuckDB in-process database

1.5.1 verified Tue May 12 auth: no python install: draft

DuckDB is an in-process SQL OLAP (Online Analytical Processing) database management system designed for fast analytical queries directly within your Python application. It runs without a separate server and integrates with Python data libraries such as Pandas and Polars. It is actively maintained with frequent releases (currently version 1.5.1) and focuses on efficient handling of large datasets.

pip install duckdb

Common errors

error ImportError: DLL load failed while importing duckdb: The specified module could not be found.

cause This error typically occurs on Windows when the necessary Microsoft Visual C++ Redistributable package is missing or outdated, which DuckDB's pre-compiled binaries (wheels) depend on.

fix

Install the latest Microsoft Visual C++ Redistributable package from Microsoft's website. Alternatively, force pip to compile DuckDB from source with `python3 -m pip install duckdb --no-binary duckdb` (this requires a C++ compiler).

error FATAL Error: Failed: database has been invalidated because of a previous fatal error. The database must be restarted prior to being used again.

cause DuckDB enters a 'restricted mode' after encountering an internal error or crash, leaving the database in an undefined state and preventing further operations until restarted.

fix

Close the current DuckDB connection or session and then start a new connection to the database. If working with a persistent database file, DuckDB will attempt to replay the write-ahead log upon reconnection to restore the database to its state before the crash.

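error AttributeError: 'NoneType' object has no attribute 'execute'

cause `duckdb.connect()` raises an exception when it cannot open a database; it does not return `None`. This error therefore means the variable holding the connection was set to `None` somewhere else, for example by a helper function that creates the connection but does not return it, or by assigning the result of a method that returns `None` (such as `.show()`). Later code then calls `execute()` or `close()` on that `None` object.

fix

Trace where the variable is assigned and make sure it holds the object returned by `duckdb.connect()`. Check that helper functions return the connection, and wrap `duckdb.connect()` in a `try`/`except` block so that a failed connection (e.g., an incorrect database file path) surfaces as a clear error instead of leaving the variable unset.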

error IO Error: Cannot open file "...": The process cannot access the file because it is being used by another process.

cause This error typically happens when a second process or application tries to open a DuckDB database file that another process already holds open for writing. DuckDB locks the file so that only one process can write to it at a time.

fix

Ensure that only one process writes to the DuckDB database file at a time, and close connections when they are no longer needed. If several processes only need to read, open the file with `read_only=True` in each of them. Use in-memory databases (`:memory:`) or distinct file paths if independent read-write access is truly needed; DuckDB is designed for single-process, multi-threaded use with shared connections.

error BinderException: Binder Error: Referenced column "..." not found in FROM clause!

cause The binder raises this error when a query references a column name that no table or relation in scope provides. In the Python relational API it commonly appears after an aggregation: the result column is named after the expression (e.g., 'sum(pnl)'), and a later chained call that passes the string 'sum(pnl)' can be parsed as a new expression over `pnl` rather than as the name of the existing column.

fix

Explicitly alias aggregated columns, either inside the aggregate expression itself (e.g., `sum(pnl) AS total_pnl`) or with `.alias()` on an expression object, so that later calls refer to a plain column name. Alternatively, use direct SQL queries with aliases to make the column names unambiguous for subsequent operations.

Warnings

breaking Python 3.9 support was dropped in DuckDB Python v1.5.0. Users on Python 3.9 will encounter errors.

fix Upgrade your Python environment to version 3.10 or newer. DuckDB v1.5.0 requires Python >=3.10.0.

breaking The `duckdb.typing` and `duckdb.functional` modules were removed in v1.5.0, having been deprecated in v1.4.0.

fix Replace imports and usage of `duckdb.typing` with `duckdb.sqltypes`, and `duckdb.functional` with `duckdb.func`.

deprecated The methods `fetch_arrow_table()` and `fetch_record_batch()` on connections and relations have been deprecated.

fix Use the new `to_arrow_table()` and `to_arrow_reader()` methods instead for Arrow export APIs.

gotcha DuckDB's persistent storage format was not stable across versions prior to v1.0. Opening a database file written by one of those older versions with a newer DuckDB can raise an `IOException`.

fix If you encounter this, open the old database file with the DuckDB version that created it and run `EXPORT DATABASE` to a new directory, then run `IMPORT DATABASE` on that directory with the newer DuckDB version. After DuckDB v0.10, the storage format is backwards-compatible, so newer versions can read files written by older ones.

gotcha The `column` parameter in relational API functions (e.g., `min`, `max`, `sum`) was renamed to `expression` to better reflect that it accepts expressions, not just column names.

fix Update calls to these relational API functions to use `expression` instead of `column`.

deprecated The lambda arrow syntax `x -> x + 1` in SQL queries is deprecated in v1.5.0 and will emit a warning.

fix Transition to the new Python-style lambda syntax: `lambda x: x + 1`. The `lambda_syntax` configuration option controls how the old arrow syntax is handled.

gotcha Building `duckdb` from source requires a C++ compiler (like `g++`) and potentially other build tools (e.g., `cmake`). Minimal environments like Alpine Linux often lack these by default, leading to build failures.

fix Install necessary build tools like `g++` and `cmake` in your environment before attempting to install `duckdb`. For Alpine Linux, this typically involves `apk add build-base cmake`.

Install compatibility draft last tested: 2026-05-12

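| python | os / libc | status | wheel | install | import | disk |
|---|---|---|---|---|---|---|
| 3.10 | alpine (musl) | build_error | - | - | - | - |
| 3.10 | alpine (musl) | | - | - | - | - |
| 3.10 | slim (glibc) | | wheel | 2.2s | 0.10s | 77M |
| 3.10 | slim (glibc) | | - | - | 0.10s | 77M |
| 3.11 | alpine (musl) | build_error | - | - | - | - |
| 3.11 | alpine (musl) | | - | - | - | - |
| 3.11 | slim (glibc) | | wheel | 2.2s | 0.14s | 79M |
| 3.11 | slim (glibc) | | - | - | 0.15s | 79M |
| 3.12 | alpine (musl) | build_error | - | - | - | - |
| 3.12 | alpine (musl) | | - | - | - | - |
| 3.12 | slim (glibc) | | wheel | 2.0s | 0.17s | 71M |
| 3.12 | slim (glibc) | | - | - | 0.17s | 71M |
| 3.13 | alpine (musl) | build_error | - | - | - | - |
| 3.13 | alpine (musl) | | - | - | - | - |
| 3.13 | slim (glibc) | | wheel | 2.0s | 0.16s | 70M |
| 3.13 | slim (glibc) | | - | - | 0.15s | 70M |
| 3.9 | alpine (musl) | build_error | - | - | - | - |
| 3.9 | alpine (musl) | | - | - | - | - |
| 3.9 | slim (glibc) | | wheel | 2.6s | 0.10s | 74M |
| 3.9 | slim (glibc) | | - | - | 0.10s | 74M |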

Imports

duckdb
wrong
```
from duckdb import connect
```
correct
```
import duckdb
```
While 'from duckdb import connect' works, 'import duckdb' is more common as many functions (like duckdb.sql) operate on a default in-memory connection if no explicit connection object is created.
duckdb.sqltypes
wrong
```
from duckdb import typing
```
correct
```
from duckdb import sqltypes
```
The `duckdb.typing` module was deprecated in 1.4.0 and removed in 1.5.0; use `duckdb.sqltypes` instead for type definitions.
duckdb.func
wrong
```
from duckdb import functional
```
correct
```
from duckdb import func
```
The `duckdb.functional` module was deprecated in 1.4.0 and removed in 1.5.0; use `duckdb.func` instead for functional APIs.

Quickstart last tested: 2026-04-24

This quickstart demonstrates how to connect to an in-memory DuckDB database, execute SQL queries, insert data, and retrieve results as a Pandas DataFrame. It also shows the convenience of using the default global in-memory database directly via `duckdb.sql()` for quick operations. For persistent storage, specify a file path in `duckdb.connect()`.

```
import duckdb

# Connect to an in-memory database (data is lost when the connection closes)
con = duckdb.connect(database=':memory:')

# Execute a SQL query and print the result; show() prints and returns None,
# so its return value is not assigned
con.sql("SELECT 42 AS answer").show()

# Create a table and insert data
con.execute("CREATE TABLE my_table (id INTEGER, name VARCHAR)")
con.execute("INSERT INTO my_table VALUES (1, 'Alice'), (2, 'Bob')")

# Query the table and fetch results as a Pandas DataFrame (requires pandas)
df_result = con.sql("SELECT * FROM my_table WHERE id = 1").df()
print(df_result)

# Example of using the default global in-memory database
df_global = duckdb.sql("SELECT 'Hello, DuckDB!' AS message").df()
print(df_global)
```