Valohai Utils

0.7.0 verified Mon Apr 27 auth: no python

Utilities for building and running machine learning pipelines on the Valohai platform. The current version is 0.7.0; the library is stable, with occasional updates to track Valohai API changes.

pip install valohai-utils
error ModuleNotFoundError: No module named 'valohai'
cause The valohai-utils package is not installed in the active environment. The distribution is named valohai-utils, but the module you import is 'valohai', not 'valohai_utils'.
fix
Install valohai-utils (pip install valohai-utils) then use 'import valohai'.
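The difference between the distribution name and the import name can be checked from Python with the standard library alone. The sketch below is a minimal illustration, not part of valohai-utils: the helper name describe_install is hypothetical, and it assumes the names stated above (distribution valohai-utils, module valohai).

```python
from importlib import metadata, util


def describe_install(dist_name: str, module_name: str) -> dict:
    """Report whether a distribution is installed and its module is importable.

    describe_install is a hypothetical helper for illustration only.
    """
    try:
        # Looks up the installed distribution by its pip name
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    # find_spec returns None when a top-level module cannot be located
    importable = util.find_spec(module_name) is not None
    return {"installed_version": version, "importable": importable}


if __name__ == "__main__":
    # Names as given in this entry: pip name vs. import name
    print(describe_install("valohai-utils", "valohai"))
```

If installed_version is None, the package is missing from the interpreter you are running, which is the usual reason for the ModuleNotFoundError above.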
error KeyError: 'dataset' when accessing ctx.inputs['dataset']
cause The input name in code does not match the key defined in valohai.yaml inputs section.
fix
Ensure the input name in valohai.yaml matches the key used in code exactly. For example, if the step's inputs section declares an entry with name: dataset, the code must use ctx.inputs['dataset'].
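A name mismatch of this kind is easier to diagnose if the lookup fails with a hint. The following is a minimal sketch under stated assumptions: it takes the declared input names as a plain list (reading them from valohai.yaml is not shown), and resolve_input is a hypothetical helper, not a valohai-utils function.

```python
import difflib


def resolve_input(declared_names, requested):
    """Return `requested` if it is declared, else raise KeyError with a hint.

    declared_names: input names as written in the YAML (assumed already parsed).
    """
    if requested in declared_names:
        return requested
    # Compare case-insensitively so that 'Dataset' suggests 'dataset'
    lowered = {name.lower(): name for name in declared_names}
    close = difflib.get_close_matches(requested.lower(), list(lowered), n=1)
    hint = f" Did you mean {lowered[close[0]]!r}?" if close else ""
    raise KeyError(f"Input {requested!r} is not declared.{hint}")
```

Calling resolve_input(["dataset", "labels"], "Dataset") raises a KeyError whose message suggests 'dataset', which makes a case mismatch obvious before the step runs.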
breaking In v0.5+, task_context no longer auto-uploads outputs; you must explicitly call ctx.upload_outputs() or use write_output which auto-uploads.
fix After writing outputs, call ctx.upload_outputs() or use ctx.write_output() which handles upload.
gotcha Inputs are accessed via comma-separated values in the pipeline YAML. If you define an input as 'dataset', you must use ctx.inputs['dataset']; the lookup is case-sensitive and must match the YAML key exactly.
fix Use ctx.inputs['input_name'] exactly as defined in valohai.yaml under inputs:.
gotcha The local execution mode does not validate Valohai API tokens; scripts that work locally may fail on Valohai due to missing environment variables.
fix Always test on Valohai or use valohai-utils in an environment with VALOHAI_TOKEN set.
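One way to catch this earlier is a preflight check that fails fast when expected variables are absent. This is a minimal sketch: VALOHAI_TOKEN is the variable named above, while missing_env_vars is a hypothetical helper and the list of required variables is an assumption you should adapt to your own step.

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty.

    environ defaults to os.environ; pass a dict to check another mapping.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    # Assumed requirement list; extend it with whatever your script reads
    missing = missing_env_vars(["VALOHAI_TOKEN"])
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
```

Running the same check locally and on Valohai surfaces environment differences before the script reaches any platform-dependent code.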
deprecated valohai.prepare_execution() is deprecated. Use valohai.initialize() or task_context instead.
fix Replace valohai.prepare_execution() with valohai.initialize() and use task_context.

Minimal Valohai pipeline step using task_context for input/output management.

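from valohai import task_context

if __name__ == '__main__':
    with task_context() as ctx:
        # Iterate over the files provided for the 'dataset' input;
        # the name must match the key in valohai.yaml exactly
        for file_path in ctx.inputs['dataset']:
            print(f"Processing {file_path}")

        # Write each declared output; data and output_path are
        # placeholders, so substitute real content and a real path
        for output_name in ctx.outputs:
            ctx.write_output(output_name, data=[], output_path='/some/path')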