Dora Search
Dora Search is an experiment management tool from Facebook Research that simplifies grid searches and hyperparameter tuning for machine learning projects. It organizes experiments, logs results, and helps ensure reproducibility. Currently at version 0.1.12, it is under active development with frequent minor updates focused on robust experiment tracking and launch capabilities.
Warnings
- gotcha The `xp.run()` function is primarily for quick local testing and does not fully leverage Dora's experiment management capabilities (like structured output directories, reproducibility, or distributed launching). For full features and robust experiment tracking, `dora.main()` is the recommended entry point.
- gotcha The experiment function passed to `dora.main` (or `xp.run`) must accept exactly one argument, which will be a dictionary containing the current experiment's configuration, including Dora's internal parameters (e.g., `_xp_id`, `_xp_dir`).
- gotcha While Dora manages experiment parameters and results, it does not automatically manage the specific Python environment (dependencies, versions) used by your experiment code. Reproducibility often depends on consistent external environments.
- gotcha Dora uses configuration keys prefixed with an underscore (e.g., `_xp_dir`, `_xp_id`) for its internal parameters. Defining your own grid arguments with keys starting with `_` can lead to conflicts or unexpected behavior.
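The call-shape and naming gotchas above can be sketched in plain Python, with no Dora import: the experiment function takes exactly one argument, a config dict carrying both your grid parameters and Dora-style internal keys such as `_xp_id`, and a small guard can catch user-defined keys that would collide with the underscore namespace. Note that `check_grid_keys` is a hypothetical helper for illustration, not part of Dora's API.

```python
# Illustrative sketch only: plain Python, no Dora import.
# check_grid_keys is a hypothetical helper, not part of Dora's API.

def check_grid_keys(grid_args):
    """Reject user grid keys starting with '_' (reserved for Dora internals)."""
    bad = [k for k in grid_args if k.startswith("_")]
    if bad:
        raise ValueError(f"Grid keys {bad} collide with Dora's internal '_' namespace")

def train_model(config):
    """Experiment entry point: exactly one argument, the full config dict."""
    # User parameters and Dora-style internal keys arrive in the same dict.
    lr = config["lr"]
    xp_id = config.get("_xp_id", "local")  # internal key injected by the launcher
    return {"xp_id": xp_id, "loss": lr * 10}

check_grid_keys({"lr": [0.01, 0.1]})  # user keys only: passes
result = train_model({"lr": 0.1, "_xp_id": "abc123", "_xp_dir": "/tmp/xp"})
print(result)
```

Keeping user keys out of the `_` namespace up front avoids silently shadowing whatever internal parameters the launcher injects into the same dict.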
Install
- pip install dora-search
Imports
- xp
import dora.xp as xp
- main
import dora
- dora
import dora
Quickstart
import dora
import dora.xp as xp
import json
import os
import time
from typing import Dict

def train_model(config: Dict):
    """Simulates a training run with a given configuration."""
    print(f"[Experiment {config.get('_xp_id', 'local')}] Running with config: {config}")
    # Simulate some work
    time.sleep(0.1)
    result = config["lr"] * config["batch_size"]
    print(f"[Experiment {config.get('_xp_id', 'local')}] Result: {result}")
    # Write metrics into the experiment directory that Dora assigns via _xp_dir
    xp_dir = config.get("_xp_dir")
    if xp_dir:
        os.makedirs(xp_dir, exist_ok=True)
        with open(os.path.join(xp_dir, "metrics.json"), "w") as f:
            json.dump({"loss": result, "accuracy": 1 - result / 100}, f)
    return {"loss": result, "accuracy": 1 - result / 100}

# Define the grid search parameters
grid = [
    xp.arg("lr", [0.01, 0.1, 1.0]),
    xp.arg("batch_size", [16, 32]),
]

if __name__ == "__main__":
    print("Launching Dora experiments...")
    # dora.main is the recommended entry point: it iterates through the grid,
    # calls train_model for each configuration, and manages experiment
    # directories and logs.
    dora.main(train_model, grid=grid)
    print("Dora experiments finished.")
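After a run like the Quickstart, each experiment directory should contain the `metrics.json` that the training function wrote. The results can then be collected with plain Python; this sketch assumes only the file layout produced by the example above (one subdirectory per experiment, each holding a `metrics.json`) and makes no Dora calls. The demo tree it builds uses made-up experiment IDs.

```python
import json
import os
import tempfile

def collect_metrics(root):
    """Gather the metrics.json written by each experiment under root."""
    results = {}
    for xp_id in sorted(os.listdir(root)):
        path = os.path.join(root, xp_id, "metrics.json")
        if os.path.isfile(path):
            with open(path) as f:
                results[xp_id] = json.load(f)
    return results

# Demo with a fake experiment tree mirroring the Quickstart's output layout.
root = tempfile.mkdtemp()
for xp_id, loss in [("xp0", 0.16), ("xp1", 0.32)]:
    os.makedirs(os.path.join(root, xp_id))
    with open(os.path.join(root, xp_id, "metrics.json"), "w") as f:
        json.dump({"loss": loss}, f)

# Pick the experiment with the lowest recorded loss.
best = min(collect_metrics(root).items(), key=lambda kv: kv[1]["loss"])
print(best)
```

Because the metrics live in ordinary JSON files, this kind of post-hoc analysis works even when the launcher itself is no longer running.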