colabfit-kit
0.0.10 verified Fri May 01 auth: no python
A Python library for loading, transforming, and managing training datasets for interatomic potentials (e.g., materials science, DFT data). The current version is 0.0.10 and requires Python >=3.10. The project is in early-stage development, so breaking changes are likely.
pip install colabfit-kit
Common errors
error ModuleNotFoundError: No module named 'colabfit'
cause Trying to import the top-level package (colabfit) instead of the submodule (colabfit.kit).
fix Use `from colabfit.kit import Dataset` instead of `import colabfit`.
error ImportError: cannot import name 'Dataset' from 'colabfit.kit'
cause Outdated version of colabfit-kit that doesn't have the Dataset class yet, or a conflicting installation.
fix Upgrade to the latest version: `pip install --upgrade colabfit-kit`. If you are already on the latest version, check for multiple installed packages with `pip list | grep colabfit`.
error ValueError: Unsupported file format
cause Attempting to load a file format that is not implemented (e.g., .pwmat, .vasp).
fix Convert your dataset to extxyz format (e.g., using ASE: `from ase.io import write; write('data.extxyz', atoms_list)`) and then use `Dataset.from_file('data.extxyz', format='extxyz')`.
Warnings
breaking Alpha version: the API is unstable. Expect breaking changes without deprecation warnings between minor versions.
fix Pin to an exact version (for example `colabfit-kit==0.0.10`) and re-run your tests after each upgrade. Monitor the project's GitHub releases.
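One way to notice an accidental upgrade early is to compare the installed version against the pin at startup. The sketch below uses only the standard library; it assumes the distribution name is the same as the pip install name on this page, and `check_pin` is a hypothetical helper, not part of colabfit-kit.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: the distribution name matches the pip install name on this page.
DIST_NAME = "colabfit-kit"
PINNED_VERSION = "0.0.10"  # the version this page documents


def check_pin(dist_name: str = DIST_NAME, pinned: str = PINNED_VERSION) -> str:
    """Return 'ok', 'mismatch', or 'missing' for the installed distribution."""
    try:
        installed = version(dist_name)
    except PackageNotFoundError:
        # Nothing with this distribution name is installed in the environment.
        return "missing"
    return "ok" if installed == pinned else "mismatch"


if __name__ == "__main__":
    print(f"{DIST_NAME}: {check_pin()}")
```

Running a check like this in CI or at the top of a training script turns a silent API change into an explicit signal to re-run the tests.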
gotcha The correct import path is `colabfit.kit`, not `colabfit`. Using `import colabfit` imports a different namespace (usually empty or a different package).
fix Always use `from colabfit.kit import ...`.
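When an import fails, it can help to see which part of the dotted path actually resolves. The following is a minimal diagnostic sketch built on the standard library's `importlib.util.find_spec`; `diagnose_import` is a hypothetical helper name and its messages are illustrative only.

```python
import importlib.util


def diagnose_import(module_name: str = "colabfit.kit") -> str:
    """Report which part of a dotted module path can be resolved."""
    top_level = module_name.split(".")[0]
    if importlib.util.find_spec(top_level) is None:
        return f"'{top_level}' is not importable: is the package installed in this environment?"
    try:
        # For a dotted name, find_spec imports the parent package first.
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when the parent resolves but is not a package.
        spec = None
    if spec is None:
        return f"'{top_level}' resolves, but '{module_name}' does not: check for a conflicting package"
    return f"'{module_name}' resolves to {spec.origin}"


if __name__ == "__main__":
    print(diagnose_import())
```

If the top-level name resolves but the submodule does not, a different distribution is probably providing `colabfit`, which fits the conflicting-installation case described under Common errors.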
gotcha File format support is limited. The `from_file` method may raise `ValueError` for unsupported formats. Known supported formats: extxyz, POSCAR, .xyz (with restrictions).
fix Check documentation for supported formats or convert files to extxyz.
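A cheap pre-check on the file name can flag files that probably need conversion before `from_file` is called. This sketch takes its format list from the known supported formats above; the mapping from extensions and file names to formats is an assumption for illustration, not the library's own detection logic, and `looks_supported` is a hypothetical helper.

```python
from pathlib import Path

# Taken from the "known supported formats" list on this page. The suffix and
# file-name mapping is an assumption, not colabfit-kit's own detection logic.
SUPPORTED_SUFFIXES = {".extxyz", ".xyz"}
SUPPORTED_FILENAMES = {"POSCAR"}


def looks_supported(path: str) -> bool:
    """Guess from the file name alone whether the file is in a listed format."""
    p = Path(path)
    return p.suffix.lower() in SUPPORTED_SUFFIXES or p.name in SUPPORTED_FILENAMES


if __name__ == "__main__":
    for candidate in ["train.extxyz", "run1/POSCAR", "calc.pwmat"]:
        print(candidate, looks_supported(candidate))
```

Files that fail the check are candidates for conversion to extxyz. A passing name does not guarantee a successful load, since .xyz support is listed as restricted.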
Imports
- Dataset
wrong from colabfit import Dataset
correct from colabfit.kit import Dataset
- ConfigurationSet
from colabfit.kit import ConfigurationSet
Quickstart
from colabfit.kit import Dataset
# Load an example dataset (replace with your file path)
ds = Dataset.from_file('path/to/extxyz', format='extxyz')
print(ds)
# Filter by property
ds_filtered = ds.filter({'energy': {'$exists': True}})
print(len(ds_filtered))