Strands Agents Evals

0.1.16 verified Fri May 01 auth: no python

Evaluation framework for Strands agents. The current version is 0.1.16; the project is pre-1.0 and under rapid development, with new releases weekly.

pip install strands-agents-evals
error ModuleNotFoundError: No module named 'strands_agents_evals'
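cause The package is installed under the hyphenated name 'strands-agents-evals', but the import name uses underscores. The error appears when the package is missing from the active environment.
fix
Install with `pip install strands-agents-evals` and import with `from strands_agents_evals import ...`.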
error TypeError: Evaluation.run() missing 1 required positional argument: 'dataset'
cause Since v0.1.10, `dataset` is a required argument of `Evaluation.run()`. Code written for earlier versions may omit it.
fix
Pass a `dataset` argument (a list of inputs).
error AttributeError: module 'strands_agents_evals' has no attribute 'eval_function'
cause Attempting to access eval_function as a submodule (e.g., `import strands_agents_evals.eval_function`).
fix
Use `from strands_agents_evals import eval_function`.
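
When the ModuleNotFoundError above appears, it can help to check the pip distribution and the import name separately. The following is a minimal sketch using only the standard library. The helper names `distribution_version` and `module_importable` are hypothetical, and the two package names are the ones given on this page.

```python
from importlib import metadata, util
from typing import Optional


def distribution_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def module_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Names as given on this page: hyphens for pip, underscores for import.
    print("distribution:", distribution_version("strands-agents-evals"))
    print("importable:", module_importable("strands_agents_evals"))
```

If the first line prints `None`, the package is probably not installed in the interpreter that is running the script, which is a common cause of this error in virtual environments.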
breaking In v0.1.10, the Evaluation.run() signature changed: `dataset` is now required (previously optional).
fix Always provide a `dataset` argument.
deprecated The `score` parameter in metrics is deprecated since v0.1.12; use `scoring_fn` instead.
fix Replace `score=...` with `scoring_fn=...` in metric definitions.
gotcha Package name uses hyphens ('strands-agents-evals'), but Python imports use underscores ('strands_agents_evals'). Many users mistakenly import the hyphenated name.
fix Use `from strands_agents_evals import ...` (underscores).
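
The `score` to `scoring_fn` rename can be applied mechanically to existing metric definitions. The sketch below assumes a metric definition is a plain dict of keyword arguments, which this page does not confirm. `migrate_metric_kwargs` and `exact_match` are hypothetical names, and nothing here calls the library itself.

```python
import warnings
from typing import Any, Dict


def migrate_metric_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs with `score` renamed to `scoring_fn`.

    Hypothetical helper: it only rewrites a plain dict.
    """
    updated = dict(kwargs)
    if "score" in updated:
        if "scoring_fn" in updated:
            raise ValueError("pass either `score` or `scoring_fn`, not both")
        warnings.warn(
            "`score` is deprecated since v0.1.12; use `scoring_fn`",
            DeprecationWarning,
            stacklevel=2,
        )
        updated["scoring_fn"] = updated.pop("score")
    return updated


def exact_match(expected: str, actual: str) -> float:
    """Illustrative scoring function: 1.0 on an exact match, else 0.0."""
    return 1.0 if expected == actual else 0.0


# Assumed shape of a legacy metric definition (not taken from the library).
legacy = {"name": "accuracy", "score": exact_match}
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    current = migrate_metric_kwargs(legacy)
```

The helper copies its input, so the legacy dict is left unchanged and the two forms can be compared during a migration.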

Runs an evaluation using the default configuration.

from strands_agents_evals import Evaluation

result = Evaluation.run(
    agent_name="my_agent",
    dataset=["test input"],
    metrics=["accuracy"]
)
print(result)
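
Because `dataset` is required since v0.1.10, a small guard in front of the call can catch a missing or malformed dataset early. This is a sketch under assumptions: `run_with_dataset` is a hypothetical wrapper, and `fake_run` stands in for `Evaluation.run` so the example runs without the package installed. The keyword names are the ones used in the quickstart above.

```python
from typing import Any, Callable, Sequence


def run_with_dataset(
    run: Callable[..., Any],
    agent_name: str,
    dataset: Sequence[str],
    metrics: Sequence[str],
) -> Any:
    """Validate the inputs, then forward them as keyword arguments to `run`."""
    if isinstance(dataset, str):
        raise TypeError("dataset must be a list of inputs, not a single string")
    if len(dataset) == 0:
        raise ValueError("dataset must contain at least one input")
    return run(agent_name=agent_name, dataset=list(dataset), metrics=list(metrics))


def fake_run(**kwargs: Any) -> dict:
    """Stand-in for Evaluation.run (assumption): echoes the arguments it gets."""
    return kwargs


result = run_with_dataset(fake_run, "my_agent", ["test input"], ["accuracy"])
```

To use it with the real package, pass `Evaluation.run` in place of `fake_run`, assuming it accepts the same keyword arguments as in the quickstart.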