colabfit-kit

0.0.10 verified Fri May 01 auth: no python

A Python library for loading, transforming, and managing training datasets for interatomic potentials (e.g., DFT data from materials science). The current version is 0.0.10 and requires Python >=3.10. The project is in early-stage development, so breaking changes are likely.

pip install colabfit-kit
error ModuleNotFoundError: No module named 'colabfit'
cause Trying to import the top-level package (colabfit) instead of the submodule (colabfit.kit).
fix
Use `from colabfit.kit import Dataset` instead of `import colabfit`.
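Before running a longer script, it can help to check which import paths actually resolve in the current environment. The following is a minimal sketch using only the standard library; the helper name `has_module` is hypothetical and not part of colabfit-kit.

```python
import importlib.util


def has_module(dotted_name: str) -> bool:
    """Return True if `dotted_name` can be located in the current environment."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except (ModuleNotFoundError, ValueError):
        # find_spec imports parent packages first; a missing parent
        # (e.g. no 'colabfit' at all) surfaces as ModuleNotFoundError.
        return False


# Report both the top-level name and the submodule this page recommends.
for name in ("colabfit", "colabfit.kit"):
    print(f"{name}: {'found' if has_module(name) else 'missing'}")
```

If `colabfit.kit` is reported as missing, the package is probably not installed in the interpreter that is running the script.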
error ImportError: cannot import name 'Dataset' from 'colabfit.kit'
cause Outdated version of colabfit-kit that doesn't have the Dataset class yet, or a conflicting installation.
fix
Upgrade to the latest version with `pip install --upgrade colabfit-kit`. If you are already on the latest version, check for multiple or conflicting installed packages with `pip list | grep colabfit`.
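The same check can be done from inside Python, which is useful on systems without `grep` or when several interpreters are installed. This is a rough sketch built on the standard-library `importlib.metadata`; the function name `find_distributions` is an illustrative choice, not a colabfit-kit API.

```python
from importlib.metadata import distributions


def find_distributions(fragment: str) -> list:
    """Return sorted (name, version) pairs for installed distributions
    whose name contains `fragment` (case-insensitive)."""
    matches = set()
    for dist in distributions():
        meta = dist.metadata
        name = meta["Name"] if meta is not None else None
        if name and fragment.lower() in name.lower():
            matches.add((name, dist.version or "unknown"))
    return sorted(matches)


# More than one line of output here may point to a conflicting installation.
for name, version in find_distributions("colabfit"):
    print(f"{name}=={version}")
```

Because the lookup runs in the active interpreter, it reports what that interpreter can see, which may differ from what a `pip` on the shell `PATH` lists.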
error ValueError: Unsupported file format
cause Attempting to load a file format that is not implemented (e.g., .pwmat, .vasp).
fix
Convert your dataset to extxyz format (e.g., using ASE: `from ase.io import write; write('data.extxyz', atoms_list)`) and then load it with `Dataset.from_file('data.extxyz', format='extxyz')`.
breaking Alpha version – API is unstable. Expect breaking changes without deprecation warnings between minor versions.
fix Pin to exact version and test after upgrades. Monitor GitHub releases.
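Pinning is usually done with a requirements line such as `colabfit-kit==0.0.10`. As an extra safeguard, a script can verify the pin at startup. The sketch below assumes the version number quoted on this page; `PINNED` and `check_pins` are hypothetical names introduced for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: 0.0.10 is the release you tested against; adjust as needed.
PINNED = {"colabfit-kit": "0.0.10"}


def check_pins(pins: dict) -> list:
    """Return one human-readable message per pin that is not satisfied."""
    problems = []
    for package, expected in pins.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            problems.append(f"{package} is not installed (expected {expected})")
            continue
        if installed != expected:
            problems.append(f"{package} is {installed}, expected {expected}")
    return problems


for problem in check_pins(PINNED):
    print("WARNING:", problem)
```

An exact string comparison is deliberate here: with an unstable API, even a patch-level difference is worth flagging.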
gotcha The correct import path is `colabfit.kit`, not `colabfit`. Using `import colabfit` will import a different namespace (usually empty or a different package).
fix Always use `from colabfit.kit import ...`.
gotcha File format support is limited. The `from_file` method may raise `ValueError` for unsupported formats. Known supported formats: extxyz, POSCAR, .xyz (with restrictions).
fix Check documentation for supported formats or convert files to extxyz.
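A cheap pre-check on the file name can catch an unsupported format before the loader raises. The following sketch is based only on the formats listed above (extxyz, POSCAR, .xyz); the helper `looks_supported` and both constants are hypothetical, and the actual list in colabfit-kit may differ.

```python
from pathlib import Path

# Assumption: mirrors the "known supported formats" listed on this page.
SUPPORTED_SUFFIXES = {".extxyz", ".xyz"}
SUPPORTED_FILENAMES = {"POSCAR"}


def looks_supported(path: str) -> bool:
    """Guess from the file name alone whether the format is likely supported."""
    p = Path(path)
    return p.name in SUPPORTED_FILENAMES or p.suffix.lower() in SUPPORTED_SUFFIXES


for candidate in ("train.extxyz", "runs/POSCAR", "final.pwmat"):
    status = "ok" if looks_supported(candidate) else "convert to extxyz first"
    print(f"{candidate}: {status}")
```

This inspects only the name, not the file contents, so it is a hint rather than a guarantee that loading will succeed.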

Loads an extended XYZ file into a Dataset, then keeps only the entries that have an 'energy' property.

from colabfit.kit import Dataset
# Load an example dataset (replace with your file path)
ds = Dataset.from_file('path/to/extxyz', format='extxyz')
print(ds)
# Filter by property
ds_filtered = ds.filter({'energy': {'$exists': True}})
print(len(ds_filtered))