Valohai YAML Parser and Validator
valohai-yaml is a Python library for parsing, validating, and programmatically constructing `valohai.yaml` configuration files, which define machine learning pipelines and experiments for the Valohai platform. It checks that YAML configurations adhere to the Valohai schema. The current version is 0.56.0. Releases are frequent and are mostly minor updates that add support for new Valohai platform features.
Common errors
- ModuleNotFoundError: No module named 'valohai_yaml'
  cause: The `valohai-yaml` library is not installed in the current Python environment.
  fix: Run `pip install valohai-yaml` to install the package.
- valohai_yaml.errors.ValidationError: [YAML is not valid] ...
  cause: The provided YAML content does not conform to the Valohai YAML schema, or contains structural errors.
  fix: Review the YAML content against the Valohai documentation and the messages carried by the `ValidationError` to find and correct the invalid syntax or structure. Use `valohai_yaml.lint` for detailed validation reports.
- yaml.parser.ParserError: while parsing a block mapping
  cause: The input string contains malformed YAML that the underlying PyYAML parser cannot process, typically because of incorrect indentation, missing colons, or invalid character sequences.
  fix: Correct the YAML syntax, paying close attention to indentation and standard YAML formatting rules. An online YAML validator can help locate basic syntax errors.
- AttributeError: 'NoneType' object has no attribute 'steps'
  cause: The variable holding the parsed config is `None`, typically because `valohai_yaml.parse()` failed on invalid YAML and the failure was swallowed, or because the result was not checked before accessing attributes such as `steps`.
  fix: Wrap `parse()` calls in a `try...except valohai_yaml.errors.ValidationError` block, or check that the return value is not `None` before accessing its attributes. Make sure the YAML input is valid.
Warnings
- breaking: Support for Python 3.8 was dropped in version 0.47.0. Users on Python 3.8 will encounter installation or runtime errors.
- breaking: In version 0.47.0, the linter was updated to flag duplicate top-level entity names (e.g., two steps with the same name) as errors instead of warnings. Previously invalid configurations may now fail validation.
- gotcha: Version 0.56.0 changed serialization behavior to not serialize empty edge configs or default node error actions. If you rely on programmatic serialization to regenerate exact YAML structures, the output may differ from earlier versions.
- gotcha: Since v0.56.0, the `valohai-yaml` library exports various `Items` and `Enums` directly from the top-level package for convenience. Importing from submodules (e.g., `valohai_yaml.yaml.parse`) still works, but the recommended path is direct import (`from valohai_yaml import parse`).
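Because several of these warnings depend on the Python version and on which valohai-yaml release is installed, a quick environment check can help. The sketch below uses only the standard library. The helper names are hypothetical, and the 3.9 minimum is an assumption inferred from 3.8 support being dropped.

```python
import itertools
import sys
from importlib import metadata
from typing import Optional, Tuple

# Assumption: 3.9 is the lowest supported Python once 3.8 is dropped.
MIN_PYTHON: Tuple[int, int] = (3, 9)


def installed_version(dist_name: str = "valohai-yaml") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn '0.47.0' into (0, 47, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def describe_environment() -> dict:
    """Summarize the facts the warnings above depend on."""
    version = installed_version()
    return {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "installed": version,
        # From 0.47.0 on, duplicate top-level names are lint errors.
        "duplicates_are_errors": version is not None
        and version_tuple(version) >= (0, 47, 0),
    }


print(describe_environment())
```

An `installed` value of `None` corresponds to the `ModuleNotFoundError` case: the package is missing from the environment that is running the script.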
Install
- pip install valohai-yaml
Imports
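- parse
from valohai_yaml import parse  # recommended top-level import
from valohai_yaml.yaml import parse  # submodule path
- lint
from valohai_yaml import lint  # recommended top-level import
from valohai_yaml.lint import lint  # submodule path
- build_config
from valohai_yaml import build_config  # recommended top-level import
from valohai_yaml.build import build_config  # submodule path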
Quickstart
from valohai_yaml import parse
valohai_yaml_content = '''
- step:
    name: training-step
    image: python:3.9
    command: python train.py
    inputs:
      - name: dataset
        default: s3://my-bucket/data/dataset.csv
    outputs:
      - name: model
        default: model.pkl
'''
config = parse(valohai_yaml_content)
print(f"Parsed Valohai config with {len(config.steps)} step(s).")
print(f"First step name: {config.steps[0].name}")
# Example of linting (requires more complex setup usually)
# from valohai_yaml import lint
# errors = lint(config)
# if errors: print(f"Linting errors: {errors}")
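The quickstart document is a top-level list of `- step:` entries, and since 0.47.0 the linter treats duplicate names among such entries as errors. As a rough pre-flight sketch, the check below looks for repeated step names in data shaped like that list once it has been loaded into plain Python objects. The function `duplicate_step_names` and the sample data are hypothetical, and this is not how the library's linter is implemented.

```python
from collections import Counter
from typing import Any, Dict, List


def duplicate_step_names(document: List[Dict[str, Any]]) -> List[str]:
    """Return step names that occur more than once, in first-seen order."""
    names = [
        entry["step"]["name"]
        for entry in document
        if isinstance(entry, dict)
        and isinstance(entry.get("step"), dict)
        and "name" in entry["step"]
    ]
    counts = Counter(names)  # keeps insertion order of first occurrence
    return [name for name in counts if counts[name] > 1]


# Sample data mirroring the quickstart layout, with one name repeated.
document = [
    {"step": {"name": "training-step", "image": "python:3.9", "command": "python train.py"}},
    {"step": {"name": "evaluate", "image": "python:3.9", "command": "python eval.py"}},
    {"step": {"name": "training-step", "image": "python:3.9", "command": "python retrain.py"}},
]

duplicates = duplicate_step_names(document)
if duplicates:
    print(f"Duplicate step names: {duplicates}")
else:
    print("All step names are unique.")
```

Entries that are not steps, or steps without a `name`, are skipped here rather than reported. Schema-level problems of that kind are left to the library's own validation.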