NucliaDB Dataset

6.13.0.post6240 verified Sat May 09 auth: no python

Python client for exporting and managing datasets from NucliaDB, a semantic search engine for unstructured data. Version 6.13.0.post6240 supports Python 3.10+. Actively maintained on GitHub.

pip install nucliadb-dataset
error ImportError: cannot import name 'NucliaDataset' from 'nucliadb_dataset'
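cause The `nucliadb_dataset` module was found but does not expose `NucliaDataset`. This usually means an outdated or mismatched installed version, or a local file named `nucliadb_dataset.py` shadowing the package. (A package that is not installed at all raises ModuleNotFoundError instead.)
fix Run `pip install --upgrade nucliadb-dataset` and verify that the installed version is >=6.0.0.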
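One way to confirm which version is actually installed, without importing the package itself, is the standard library's `importlib.metadata`. The sketch below is illustrative only: the helper names `installed_version` and `meets_minimum` are invented for this example, and the simple parser compares just the leading numeric components of the version string.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def meets_minimum(version_string, minimum=(6, 0, 0)):
    """Compare the leading numeric components of a version against a minimum tuple."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as "post6240"
        parts.append(int(piece))
    # Pad short versions such as "6" so they compare as (6, 0, 0)
    while len(parts) < len(minimum):
        parts.append(0)
    return tuple(parts) >= minimum


found = installed_version("nucliadb-dataset")
if found is None:
    print("nucliadb-dataset is not installed")
else:
    print(found, "OK" if meets_minimum(found) else "is older than 6.0.0")
```

If the printed version is older than 6.0.0, or the package is reported as missing, reinstalling with pip is the first thing to try.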
error TypeError: __init__() missing 1 required positional argument: 'api_key'
cause API key not provided or passed incorrectly.
fix Pass `api_key` as a keyword argument: `NucliaDataset(api_key='your-key')`.
breaking In v6.0+, the dataset API was restructured: `list_datasets()` was renamed to `list()` and `get_dataset()` to `get()`. Code that still calls the old method names will fail.
fix Update method calls: `list()` and `get()` instead of `list_datasets()` and `get_dataset()`.
deprecated The `NucliaDataset` constructor no longer accepts a `region` parameter; use `endpoint` instead.
fix Replace `region='europe-1'` with `endpoint='https://europe-1.nucliadb.com'`.
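When many call sites still pass `region`, the replacement can be done in one place. The sketch below assumes that every region maps to a URL of the same shape as the `europe-1` example above, which is an extrapolation from that single case and should be checked against the actual deployment. `endpoint_for_region` and `migrate_constructor_kwargs` are hypothetical helper names.

```python
def endpoint_for_region(region):
    """Build an endpoint URL from a region name (URL pattern is an assumption)."""
    region = region.strip()
    if not region:
        raise ValueError("region must be a non-empty string")
    return f"https://{region}.nucliadb.com"


def migrate_constructor_kwargs(kwargs):
    """Return a copy of kwargs with a legacy `region` entry replaced by `endpoint`."""
    updated = dict(kwargs)
    region = updated.pop("region", None)
    # An explicit endpoint always wins over one derived from the region
    if region is not None and "endpoint" not in updated:
        updated["endpoint"] = endpoint_for_region(region)
    return updated


legacy = {"api_key": "your-key", "region": "europe-1"}
print(migrate_constructor_kwargs(legacy))
```

The migrated dictionary can then be unpacked into the constructor, for example `NucliaDataset(**migrate_constructor_kwargs(legacy))`.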
gotcha The API key is required and must be passed as the keyword argument `api_key`. Passing it as a positional argument causes a TypeError.
fix Always use keyword argument: `NucliaDataset(api_key='...', endpoint='...')`.
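A missing or empty key is easier to diagnose if it is caught before the client is constructed. The helper below is a minimal sketch: `client_kwargs_from_env` is a hypothetical name, and it reads the same `NUCLIADB_API_KEY` environment variable used in the basic usage example on this page.

```python
import os


def client_kwargs_from_env(endpoint, env=os.environ):
    """Build keyword arguments for the client, failing early on a missing key."""
    api_key = env.get("NUCLIADB_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("NUCLIADB_API_KEY is not set or is empty")
    # Both values are returned by name so they can only be passed as keywords
    return {"api_key": api_key, "endpoint": endpoint}


# Example with an explicit mapping instead of the real process environment
example_env = {"NUCLIADB_API_KEY": "your-key"}
print(client_kwargs_from_env("https://your-nucliadb-instance", env=example_env))
```

Unpacking the result, as in `NucliaDataset(**client_kwargs_from_env('https://your-nucliadb-instance'))`, guarantees that both values reach the constructor as keyword arguments.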

Basic usage: initialize the client, then list and retrieve datasets.

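import os

from nucliadb_dataset import NucliaDataset

# Read the NucliaDB API key from the environment and stop early if it is missing
api_key = os.environ.get('NUCLIADB_API_KEY', '')
if not api_key:
    raise SystemExit('Set NUCLIADB_API_KEY before running this example')
dataset = NucliaDataset(api_key=api_key, endpoint='https://your-nucliadb-instance')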

# List available datasets
datasets = dataset.list()
print(datasets)

# Fetch a specific dataset by ID
if datasets:
    ds = dataset.get(datasets[0]['id'])
    print(ds)