Eval Protocol Python SDK
Version 0.3.29; verified Fri May 01; auth: no; language: python
Official Python SDK for Eval Protocol (EP), an open protocol that standardizes how developers author evals for LLM applications. The current version is 0.3.29, and the SDK is under active development.
pip install eval-protocol
Common errors
error ModuleNotFoundError: No module named 'evalprotocol'
cause The import uses 'evalprotocol', but the correct module name is 'eval_protocol'.
fix Change the import to: from eval_protocol import ...
error TypeError: EvalClient.__init__() missing 1 required positional argument: 'api_key'
cause In version 0.3+, api_key must be passed as a keyword argument.
fix Use EvalClient(api_key='your_key') instead of EvalClient('your_key').
error pydantic.error_wrappers.ValidationError: 1 validation error for Episode input field required (type=value_error.missing)
cause The Episode object was created without the required 'input' field.
fix Always include 'input' and 'expected' when creating an Episode.
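One way to avoid the validation error above is to check raw records for the required keys before building an Episode. The sketch below is plain Python and does not import the SDK. REQUIRED_FIELDS and missing_fields are hypothetical names introduced for illustration; only the 'input' and 'expected' requirement comes from the entry above.

```python
from typing import Any, List, Mapping, Sequence

# Field names taken from the error entry above; the helper itself is illustrative
# and is not part of the SDK.
REQUIRED_FIELDS = ("input", "expected")


def missing_fields(
    record: Mapping[str, Any], required: Sequence[str] = REQUIRED_FIELDS
) -> List[str]:
    """Return the required keys that are absent from record or set to None."""
    return [name for name in required if record.get(name) is None]


records = [
    {"input": "What is 2+2?", "expected": "4"},
    {"expected": "4"},  # 'input' is missing
]

for record in records:
    problems = missing_fields(record)
    if problems:
        print(f"skipping record, missing: {problems}")
    else:
        print("record is complete")
```

Records that pass this check can then be handed to Episode(input=..., expected=...) as in the fix above.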
Warnings
breaking SDK version 0.3.x changed API client initialization: api_key was previously a required positional argument and must now be passed as a keyword argument.
fix Use EvalClient(api_key='your_key') instead of EvalClient('your_key').
gotcha The import module is 'eval_protocol' (with an underscore), not 'evalprotocol' or 'eval-protocol'. Hyphens cannot be used in Python import names.
fix Use 'from eval_protocol import ...'.
gotcha Episode fields 'input' and 'expected' are required; omitting either one causes a runtime validation error.
fix Always provide both 'input' and 'expected' when creating an Episode.
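To illustrate the keyword-argument warning, the sketch below shows how a keyword-only parameter behaves in plain Python. DemoClient is a hypothetical stand-in written for this example. It is not the SDK's EvalClient, and whether EvalClient is implemented this way is an assumption.

```python
class DemoClient:
    """Stand-in for illustration only; NOT the SDK's EvalClient."""

    def __init__(self, *, api_key: str) -> None:
        # The bare * makes every parameter after it keyword-only.
        self.api_key = api_key


# Passing the key by keyword is accepted.
client = DemoClient(api_key="your_key")
print("keyword call accepted")

# Passing the key positionally is rejected with a TypeError.
try:
    DemoClient("your_key")
except TypeError as exc:
    print(f"TypeError: {exc}")
```

Always writing api_key=... at the call site works whether a parameter is keyword-only or not, which is why the fix above is safe across versions.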
Imports
- Episode
  wrong: from evalprotocol import Episode
  correct: from eval_protocol import Episode
- EvalClient
  from eval_protocol import EvalClient
- Result
  from eval_protocol import Result
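A quick way to confirm which module name is importable in the current environment is to ask the import system directly. This is a minimal sketch using only the standard library; module_available is a hypothetical helper name, and the result depends on whether the package is installed where the script runs.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


# Compare the correct module name with the common misspelling.
for candidate in ("eval_protocol", "evalprotocol"):
    status = "found" if module_available(candidate) else "not found"
    print(f"{candidate}: {status}")
```

If 'eval_protocol' is reported as not found, the package is probably not installed in the active interpreter's environment.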
Quickstart
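import os

from eval_protocol import EvalClient, Episode, Result

# Read the API key from the environment (falls back to an empty string if unset).
api_key = os.environ.get('EVAL_PROTOCOL_API_KEY', '')
client = EvalClient(api_key=api_key)  # api_key must be passed by keyword

# Create an eval and log one episode with both required fields.
eval_id = client.create_eval(name='My Quick Eval')
episode = Episode(input='What is 2+2?', expected='4')
client.log_episode(eval_id=eval_id, episode=episode)

# Run the eval and print the result.
result = client.run_eval(eval_id=eval_id)
print(result)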
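The Quickstart falls back to an empty string when EVAL_PROTOCOL_API_KEY is unset, and how the client treats an empty key is not stated here. A possible alternative is to fail early with a clear message. The sketch below uses only the standard library; require_env is a hypothetical helper, not part of the SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"environment variable {name} is not set")
    return value


try:
    api_key = require_env("EVAL_PROTOCOL_API_KEY")
    print("API key loaded")
except RuntimeError as exc:
    print(f"configuration error: {exc}")
```

The returned value can then be passed as EvalClient(api_key=api_key), as in the Quickstart.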