DataLad
1.4.1 verified Fri May 01 auth: no python
DataLad is a distributed system for joint management of code, data, and their relationships, built on top of Git and git-annex. Current version is 1.4.1, with frequent bug-fix releases. Requires Python >=3.10.
pip install datalad
Common errors
error ImportError: cannot import name 'Dataset' from 'datalad'
cause Dataset is not exported from the top-level datalad package; it lives in datalad.api.
fix Use 'from datalad.api import Dataset'.
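error AttributeError: module 'datalad' has no attribute 'api'
cause A local file named datalad.py on the import path shadows the installed package, so Python loads the local file instead. This happens on any Python version, not only 3.10+.
fix Rename any local script or module named 'datalad.py', and import the submodule explicitly ('import datalad.api' or 'from datalad.api import ...').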
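A quick way to spot the shadowing problem is to scan the working directory for files that would win over the installed package. The sketch below uses only the standard library; the helper name find_shadowing_entries is hypothetical and not part of DataLad.

```python
from pathlib import Path
import tempfile


def find_shadowing_entries(directory, package="datalad"):
    """Return paths in `directory` that would shadow `package` on import.

    Hypothetical helper: checks for `<package>.py` and for a regular
    package directory `<package>/__init__.py`.
    """
    root = Path(directory)
    candidates = [root / f"{package}.py", root / package / "__init__.py"]
    return [p for p in candidates if p.exists()]


# Demonstration in a throwaway directory containing an accidental script.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "datalad.py").write_text("# accidental local script\n")
    demo_hits = [p.name for p in find_shadowing_entries(tmp)]

print(demo_hits)  # ['datalad.py']
```

Running find_shadowing_entries('.') from the directory where the failing script lives should list the file to rename, if there is one.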
error CommandNotFoundError: git-annex is not installed
cause The git-annex binary is missing from PATH.
fix Install git-annex via the system package manager (e.g., apt-get install git-annex, or git-annex-standalone from NeuroDebian).
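Before calling into DataLad, a script can check that the external binaries are reachable. This is a minimal sketch using shutil.which from the standard library; the function name missing_tools is an assumption for illustration, not a DataLad API.

```python
import shutil


def missing_tools(tools=("git", "git-annex")):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


missing = missing_tools()
if missing:
    print("Missing from PATH:", ", ".join(missing))
else:
    print("git and git-annex are both available")
```

Failing early with a clear message like this is usually easier to debug than an error raised from deep inside a DataLad command.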
error requests.exceptions.InvalidURL: Failed to parse: ///example/dataset
cause The '///' prefix is DataLad shorthand for its default superdataset, not a general URL scheme, so '///example/dataset' is a placeholder rather than a real dataset. Pass a full source URL instead.
fix Use a valid Git URL like 'https://github.com/datalad-datasets/longnow-podcasts.git'.
Warnings
breaking Python 3.9 support was dropped in DataLad 1.3.0; Python >=3.10 is now required.
fix Upgrade Python to 3.10 or later.
breaking An API change in patoolib 2.0 broke DataLad <1.4.1; DataLad 1.4.1 includes a fix.
fix Upgrade to DataLad 1.4.1, or pin the patool distribution (which provides the patoolib module) with patool<2.0.
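The two version constraints above can be checked programmatically. The sketch below is one way to do it with the standard library. It assumes the PyPI distribution is named patool (imported as patoolib), takes its thresholds straight from the warnings above, and uses a deliberately simple version parser that ignores pre-release suffixes. All helper names are hypothetical.

```python
import sys
from importlib import metadata


def version_tuple(text):
    """Turn '1.4.1' into (1, 4, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def patool_conflict(datalad_version, patool_version):
    """True for the breaking pairing: DataLad < 1.4.1 with patool >= 2.0."""
    return (version_tuple(datalad_version) < (1, 4, 1)
            and version_tuple(patool_version) >= (2, 0))


def installed_version(dist_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


python_ok = sys.version_info >= (3, 10)
datalad_v = installed_version("datalad")
patool_v = installed_version("patool")  # assumed distribution name
print("Python >= 3.10:", python_ok)
if datalad_v and patool_v:
    print("patool conflict:", patool_conflict(datalad_v, patool_v))
```

For production use, packaging.version.Version handles pre-release and post-release tags more carefully than this tuple comparison.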
gotcha DataLad uses custom pytest markers; extensions must either register them or use the DataLad pytest plugin introduced in 1.4.0.
fix In conftest.py, set pytest_plugins = ['datalad.tests.fixtures'] to register them automatically.
gotcha Operations on adjusted branches (used on crippled filesystems) may require extra merge steps; this was fixed in 1.3.3.
fix Upgrade to 1.3.3+, or manually run git annex merge or git annex adjust.
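To tell whether a repository is affected, it helps to know if it is currently on an adjusted branch. git-annex names these branches with an adjusted/ prefix, for example adjusted/master(unlocked). The sketch below relies on that naming convention and on git branch --show-current (Git 2.22 or later); both helper names are hypothetical.

```python
import subprocess


def is_adjusted_branch(branch_name):
    """git-annex names adjusted branches 'adjusted/<original>(<mode>)'."""
    return branch_name.startswith("adjusted/")


def current_branch(repo_path="."):
    """Return the checked-out branch name, or '' if it cannot be determined."""
    try:
        result = subprocess.run(
            ["git", "-C", repo_path, "branch", "--show-current"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:  # git itself is not installed
        return ""
    return result.stdout.strip()


print(is_adjusted_branch("adjusted/master(unlocked)"))  # True
print(is_adjusted_branch("master"))                     # False
```

Calling is_adjusted_branch(current_branch(path)) on a dataset path gives a quick signal for whether the extra merge steps might apply.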
Imports
- datalad
  wrong: from datalad import api
  correct: import datalad
- datalad.api
  wrong: import DataLad
  correct: from datalad.api import Dataset
- Dataset
  wrong: from datalad import Dataset
  correct: from datalad.api import Dataset
Quickstart
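from datalad.api import clone

# Clone from a full Git URL; clone() returns a Dataset instance.
ds = clone(
    source='https://github.com/datalad-datasets/longnow-podcasts.git',
    path='longnow-podcasts',
)
print(ds.id)  # the dataset's unique ID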