Eido: Project Metadata Validator
Eido (pronounced 'eye-doe') is a Python library and CLI tool designed for validating project metadata, primarily within the Portable Encapsulated Project (PEP) framework. It leverages JSON Schema to ensure the correctness and consistency of project configuration files and sample annotations. The current version is 0.2.5, and releases are relatively infrequent, often tied to updates in the broader PEPKit ecosystem.
Common errors
- jsonschema.exceptions.ValidationError: '...' is not valid under any of the given schemas
  cause The data in your project configuration or sample table does not conform to the specified JSON schema. This error comes directly from the underlying `jsonschema` library, which `eido` uses.
  fix Carefully compare your project's `config.yaml` and `sample_table.csv` (or equivalent) against the `project_schema.yaml`. Pay attention to data types, required fields, enum values, and allowed patterns. The error message often includes details about the specific failing validation condition.
- FileNotFoundError: [Errno 2] No such file or directory: 'your_schema.yaml'
  cause The path provided for your project configuration, sample table, or schema file is incorrect, or the file does not exist at the specified location.
  fix Double-check the file paths passed to `peppy.Project()` and `eido.validate_project()`. Ensure the files exist and are accessible from where your script is running. Use absolute paths or verify your current working directory.
- AttributeError: 'NoneType' object has no attribute 'get' (or similar error when accessing project attributes)
  cause This usually indicates that `peppy.Project()` failed to load your project configuration correctly, resulting in a `Project` object that is either `None` or incomplete. `eido` then tries to access non-existent attributes of this invalid project object.
  fix Inspect your `config.yaml` for syntax errors (e.g., incorrect YAML formatting, missing required sections). Ensure `peppy` can parse your project configuration independently before passing it to `eido`. Check `peppy`'s documentation for valid project structures and common parsing issues.
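For the FileNotFoundError entry above, a small pre-flight check can show which path is wrong before `peppy` or `eido` is ever called. The following is a minimal sketch using only the standard library; the helper name `missing_paths` and the file names are hypothetical and not part of eido.

```python
from pathlib import Path


def missing_paths(*paths):
    """Return the inputs that are not existing files, as absolute paths."""
    missing = []
    for p in paths:
        candidate = Path(p).expanduser()
        if not candidate.is_file():
            # resolve() makes the path absolute so the message shows
            # exactly where Python looked.
            missing.append(str(candidate.resolve()))
    return missing


# Hypothetical file names; substitute your own config and schema paths.
problems = missing_paths("config.yaml", "your_schema.yaml")
if problems:
    print("Missing files (relative paths are resolved against the current working directory):")
    for path in problems:
        print(f"  {path}")
```

Printing the resolved absolute path makes a wrong working directory easy to spot.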
Warnings
- breaking In `v0.2.3`, several internal keys/attributes used in schema definition or project configuration were renamed: `files_key` to `sizing`, `required_files_key` to `tangible_key`, and `_samples` to `samples`. This may break existing PEP configurations or custom schemas.
- breaking The `--exclude-case` option was removed from the CLI in `v0.2.2`. Users relying on this command-line argument will find their scripts failing.
- gotcha Eido is primarily designed to validate `peppy.Project` and `peppy.Sample` objects. While it uses JSON Schema internally, directly passing arbitrary Python dictionaries to `validate_project` or `validate_sample` will likely fail unless the dictionary mimics the structure of a `peppy` object.
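If an older schema or configuration has already been loaded into a Python dictionary, the `v0.2.3` renames listed above could be applied mechanically. The sketch below assumes the old names appear as literal dictionary keys, which may not hold for every schema; the helper `rename_keys` is hypothetical and not part of eido.

```python
# Old name -> new name, taken from the v0.2.3 warning above.
RENAMES = {
    "files_key": "sizing",
    "required_files_key": "tangible_key",
    "_samples": "samples",
}


def rename_keys(node, renames=RENAMES):
    """Return a copy of a nested dict/list structure with matching keys renamed."""
    if isinstance(node, dict):
        return {renames.get(key, key): rename_keys(value, renames) for key, value in node.items()}
    if isinstance(node, list):
        return [rename_keys(item, renames) for item in node]
    # Scalars (strings, numbers, None) are returned unchanged.
    return node


# Hypothetical old-style fragment, for illustration only.
old_schema = {"properties": {"_samples": {"items": {"files_key": ["read1"]}}}}
print(rename_keys(old_schema))
```

The function builds a new structure rather than mutating its input, so the original can be kept for comparison.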
Install
- pip install eido
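Because the breaking changes listed under Warnings depend on the installed release, it can help to confirm which version is actually present. This is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("eido version:", installed_version("eido"))
```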
Imports
- validate_project
from eido import validate_project
- validate_sample
from eido import validate_sample
- validate_config
from eido import validate_config
Quickstart
import os
import peppy
import eido
import yaml
from jsonschema import ValidationError

# Create a dummy project config file
config_content = """
project_name: MyTestProject
output_dir: output/
samples:
  - sample_name: sample1
    file_path: data/sample1.txt
    organism: human
  - sample_name: sample2
    file_path: data/sample2.txt
    organism: mouse
"""

# Create a dummy schema file
schema_content = """
type: object
properties:
  project_name: { type: string }
  output_dir: { type: string }
  samples:
    type: array
    items:
      type: object
      properties:
        sample_name: { type: string }
        file_path: { type: string }
        organism: { type: string, enum: [human, mouse, rat] }
      required: [sample_name, file_path, organism]
required: [project_name, samples]
"""

config_file = "_eido_test_config.yaml"
schema_file = "_eido_test_schema.yaml"

try:
    # Write dummy files
    with open(config_file, 'w') as f:
        f.write(config_content)
    with open(schema_file, 'w') as f:
        f.write(schema_content)

    # Load the project using peppy
    project = peppy.Project(config_file)

    # Validate the project using eido
    print(f"Attempting to validate project '{project.name}'...")
    eido.validate_project(project, schema_file)
    print("Project validated successfully!")

    # Example of validation failure (optional, for demonstration)
    print("\n--- Demonstrating a validation failure ---")
    bad_config_content = """
project_name: AnotherProject
samples:
  - sample_name: invalid_sample
    organism: alien # 'alien' is not in enum
"""
    bad_config_file = "_eido_bad_config.yaml"
    with open(bad_config_file, 'w') as f:
        f.write(bad_config_content)
    bad_project = peppy.Project(bad_config_file)
    try:
        print("Attempting to validate a project with invalid organism...")
        eido.validate_project(bad_project, schema_file)
        print("Unexpected: Validation passed for bad project!")
    except ValidationError as e:
        print(f"Caught expected validation error: {e.message}")
finally:
    # Clean up dummy files
    if os.path.exists(config_file):
        os.remove(config_file)
    if os.path.exists(schema_file):
        os.remove(schema_file)
    if os.path.exists(bad_config_file):
        os.remove(bad_config_file)
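The failure demonstration above surfaces a single error message. When a configuration has several problems, it can be quicker to list all of them at once. The following is a rough sketch that uses the `jsonschema` library directly on plain dictionaries, bypassing eido and peppy entirely; the schema, the data, and the helper `collect_errors` are hypothetical and only mirror the shape of the Quickstart example.

```python
from jsonschema import Draft7Validator

# Hypothetical schema, modeled on the Quickstart schema.
schema = {
    "type": "object",
    "properties": {
        "project_name": {"type": "string"},
        "samples": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "sample_name": {"type": "string"},
                    "organism": {"type": "string", "enum": ["human", "mouse", "rat"]},
                },
                "required": ["sample_name", "organism"],
            },
        },
    },
    "required": ["project_name", "samples"],
}

# Hypothetical data with two deliberate problems.
data = {
    "project_name": "Demo",
    "samples": [
        {"sample_name": "s1", "organism": "human"},
        {"sample_name": "s2", "organism": "alien"},  # value outside the enum
        {"organism": "mouse"},                        # sample_name is missing
    ],
}


def collect_errors(instance, schema):
    """Return (location, message) pairs for every violation, not only the first."""
    validator = Draft7Validator(schema)
    errors = sorted(
        validator.iter_errors(instance),
        key=lambda err: [str(part) for part in err.absolute_path],
    )
    return [
        ("/".join(str(part) for part in err.absolute_path) or "<root>", err.message)
        for err in errors
    ]


for location, message in collect_errors(data, schema):
    print(f"{location}: {message}")
```

Each reported location is the path into the data (for example a sample index followed by a field name), which narrows down where to look in the configuration.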