OpenML Python API
Version 0.15.1, verified Fri May 01, auth: no, language: python
Python client for OpenML.org, used to download, upload, and manage datasets, tasks, runs, and flows. The current version is 0.15.1 and requires Python >=3.8. Released under the BSD license. Actively developed, with periodic releases.
pip install openml
Common errors
error openml.exceptions.OpenMLServerError: The request has not been authorized (401)
cause The OpenML API key is missing or invalid.
fix
Set the API key through the OPENML_API_KEY environment variable, or assign it in code with openml.config.apikey = '...'
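The fix above can be sketched as a small helper that reads the key from the environment and assigns it explicitly, so it does not depend on the library picking up the variable by itself. The function names `resolve_api_key` and `configure_openml` are hypothetical and not part of openml; only `openml.config.apikey` and the `OPENML_API_KEY` variable name come from this entry.

```python
import os


def resolve_api_key(environ=None, var_name="OPENML_API_KEY"):
    """Return the API key from the environment, or None if unset or blank.

    Hypothetical helper: the variable name is the one this entry gives.
    """
    environ = os.environ if environ is None else environ
    key = environ.get(var_name, "").strip()
    return key or None


def configure_openml(environ=None):
    """Assign the key to openml.config.apikey; return True if a key was set."""
    key = resolve_api_key(environ)
    if key is None:
        return False
    # Imported lazily so the helper can be tested without openml installed.
    import openml

    openml.config.apikey = key
    return True
```

Call `configure_openml()` once at startup, before any request that needs authorization. A return value of False means no key was found and authorized calls are likely to fail with the 401 error above.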
error AttributeError: 'int' object has no attribute 'get' (raised when get_dataset is called with an invalid ID)
cause The dataset ID does not exist or is not an integer. get_dataset expects an integer ID but received something else.
fix
Make sure the dataset ID is a valid integer. Use list_datasets() to look up existing IDs.
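One way to apply this fix is to validate the ID before it reaches `get_dataset`. The sketch below is a minimal illustration: `coerce_dataset_id` is a hypothetical helper, not an openml function, and it assumes dataset IDs are positive integers.

```python
def coerce_dataset_id(value):
    """Return value as a positive int, or raise ValueError if it cannot be one.

    Hypothetical helper; assumes OpenML dataset IDs are positive integers.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool):
        raise ValueError(f"not a dataset ID: {value!r}")
    try:
        dataset_id = int(str(value).strip())
    except ValueError:
        raise ValueError(f"not a dataset ID: {value!r}") from None
    if dataset_id <= 0:
        raise ValueError(f"dataset ID must be positive: {value!r}")
    return dataset_id
```

Typical use is `openml.datasets.get_dataset(coerce_dataset_id(user_input))`, which turns a confusing AttributeError into an early ValueError that names the bad input. It does not check that the ID exists on the server.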
error ModuleNotFoundError: No module named 'openml'
cause OpenML is not installed, or it is installed in a different environment.
fix
Run `pip install openml` and make sure the correct Python environment is active.
Warnings
breaking In version 0.14, `get_dataset` changed from returning a tuple to returning an `OpenMLDataset` object. Calling `get_data()` on that object now returns four values (X, y, categorical_indicator, attribute_names) rather than the structure earlier releases returned.
fix Update code to unpack all four values: `X, y, categorical_indicator, attribute_names = dataset.get_data(target=...)`.
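The third and fourth values are easier to use once paired up. The sketch below assumes `categorical_indicator` is a sequence of booleans aligned with `attribute_names`, as the value names suggest. The helper `categorical_columns` and the sample values are hypothetical stand-ins, not real OpenML output.

```python
def categorical_columns(categorical_indicator, attribute_names):
    """Return the attribute names whose indicator flag is true."""
    if len(categorical_indicator) != len(attribute_names):
        raise ValueError("indicator and names must have the same length")
    return [
        name
        for name, is_categorical in zip(attribute_names, categorical_indicator)
        if is_categorical
    ]


# Stand-in values shaped like the last two items returned by get_data();
# they are illustrative only.
indicator = [False, False, True]
names = ["age", "income", "occupation"]
print(categorical_columns(indicator, names))  # prints ['occupation']
```

With a real dataset, pass the third and fourth values unpacked from `dataset.get_data(target=...)` instead of the stand-ins.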
deprecated Calling `openml.datasets.list_datasets()` with no arguments is deprecated; pass parameters to `list_datasets()` explicitly.
fix Replace bare `list_datasets()` calls with parameterized ones such as `list_datasets(size=...)`.
gotcha Configuring the API key requires either setting the OPENML_API_KEY env var or creating a ~/.openml/config file. Without a key, many functions raise OpenMLServerError (401).
fix Set the env var, create the config file, or assign the key in code with `openml.config.apikey = 'your_apikey'`. Get a key from openml.org.
Imports
- openml
  wrong: from openml import *
  correct: import openml
- openml.datasets
  wrong: from openml.datasets import list_datasets
  correct: from openml.datasets import get_dataset
- openml.tasks
  wrong: import openml.tasks as tasks
  correct: from openml.tasks import get_task
Quickstart
import openml
# Settings from the OpenML config file are loaded when the package is imported.
# To set the key explicitly instead: openml.config.apikey = 'your_apikey'
# List datasets as a pandas DataFrame (output_format is accepted by 0.15.x)
datasets = openml.datasets.list_datasets(output_format="dataframe")
print(f"Number of datasets: {len(datasets)}")
# Download a dataset (iris)
dataset = openml.datasets.get_dataset(61)
X, y, categorical_indicator, attribute_names = dataset.get_data(target=dataset.default_target_attribute)
print(X.head())