Agent Lightning
Agent Lightning is an open-source Microsoft framework designed to train and optimize AI agents using techniques like Reinforcement Learning, Automatic Prompt Optimization, and Supervised Fine-tuning. It works with various agent frameworks (e.g., LangChain, AutoGen) with minimal code changes. The current stable version is 0.3.0, and it maintains an active development cycle with regular updates and nightly builds.
Common errors
- uv run errors with `Permission denied` under `~/.cache`
  - cause: Default `uv` cache locations can cause permission issues in certain environments, often due to restricted user permissions or containerized setups.
  - fix: Prepend `UV_CACHE_DIR="$(pwd)/.cache_uv" XDG_CACHE_HOME="$(pwd)/.cache_xdg"` to your `uv run` command, for example: `UV_CACHE_DIR="$(pwd)/.cache_uv" XDG_CACHE_HOME="$(pwd)/.cache_xdg" uv run --no-sync pytest`.
- ModuleNotFoundError: No module named 'agentlightning.algorithm.verl'
  - cause: The `verl` dependency is optional and must be installed explicitly if you plan to use VERL-based RL training.
  - fix: Install the optional dependencies with `pip install agentlightning[verl]`, or install `verl` separately with `pip install verl`.
- RuntimeError: There is already an event loop running
  - cause: When integrating `agentlightning` with other asynchronous frameworks or in interactive environments (such as Jupyter notebooks), an event loop may already be running, conflicting with `agentlightning`'s own async operations.
  - fix: Apply `nest_asyncio` at the beginning of your script or notebook to allow nested asyncio event loops: `import nest_asyncio; nest_asyncio.apply()`.
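The event-loop conflict above can be detected before it bites. A minimal, framework-free sketch using only the standard library (the helper name `run_maybe_nested` is hypothetical, not part of Agent Lightning):

```python
import asyncio

def run_maybe_nested(coro):
    """Run a coroutine, first checking whether an event loop is already running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running: plain-script path, asyncio.run() is fine.
        return asyncio.run(coro)
    # A loop is already running (e.g. inside Jupyter): this is the situation
    # where `import nest_asyncio; nest_asyncio.apply()` is needed first.
    raise RuntimeError("event loop already running; apply nest_asyncio first")

async def demo():
    await asyncio.sleep(0)
    return "ok"

result = run_maybe_nested(demo())
```

In a plain script the `asyncio.run` branch is taken; in a notebook the raise signals that the `nest_asyncio` fix from the list above applies.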
Warnings
- gotcha Agent Lightning is officially supported on Linux distributions (Ubuntu 22.04+ recommended). macOS and Windows (outside of WSL2) are currently not supported.
- breaking Nightly builds of Agent Lightning contain experimental features and unstable or untested changes, including breaking ones.
- gotcha When using `uv` for dependency management, you might encounter `Permission denied` errors under `~/.cache`. This is a known issue with `uv`'s caching mechanism.
- breaking Upstream releases of dependencies such as `verl` or `weave` can introduce incompatibilities through breaking API or interface changes.
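A common mitigation for the dependency-drift warning above is to pin versions explicitly. A sketch of a `requirements.txt` fragment (the pin reflects the stable version noted above; it is illustrative, not a recommendation):

```text
# requirements.txt -- illustrative pins only
agentlightning==0.3.0
# Pin fragile optional dependencies too, if you use them, e.g.:
# verl==<tested version>
# weave==<tested version>
```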
Install
- Stable release: `pip install agentlightning`
- Nightly build: `pip install --upgrade --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ --pre agentlightning`
Imports
- rollout
from agentlightning import rollout
- LitAgent
from agentlightning import LitAgent
- Trainer
from agentlightning import Trainer
Quickstart
import os
from typing import TypedDict
from agentlightning import rollout, Trainer, PromptTemplate, NamedResources, Rollout
# Ensure OPENAI_API_KEY is set in your environment or replace 'YOUR_OPENAI_KEY'
os.environ.setdefault("OPENAI_API_KEY", "YOUR_OPENAI_KEY")
class RoomSelectionTask(TypedDict):
    attendee_count: int
    time: str
    has_whiteboard: bool
    expected_choice: str
def room_selection_grader(final_choice: str, expected_choice: str) -> float:
    """Grades the agent's room selection."""
    return 1.0 if final_choice == expected_choice else 0.0
@rollout
def room_selector_agent(task: RoomSelectionTask, prompt_template: PromptTemplate) -> float:
    # Simulate agent logic using the prompt_template and task.
    # In a real scenario, this would involve LLM calls and tool usage.
    prompt_text = prompt_template.format(task=task)
    # Placeholder for LLM interaction and tool calls.
    # For demonstration, we'll simulate a choice.
    if task['attendee_count'] > 5 and task['has_whiteboard']:
        final_choice = 'Large Conference Room with Whiteboard'
    else:
        final_choice = 'Small Meeting Room'
    reward = room_selection_grader(final_choice, task['expected_choice'])
    return reward
# Define a simple prompt template (this would be optimized by Agent Lightning)
initial_prompt = PromptTemplate(
    template="""You are a room selection agent. Given the following task:
Attendees: {task[attendee_count]}, Time: {task[time]}, Whiteboard needed: {task[has_whiteboard]}.
Select the best room."""
)
# Example usage with a dummy trainer (full training requires more setup)
# For a complete training loop, you would typically define a dataset and an algorithm.
# This snippet focuses on demonstrating the @rollout decorator.
# Create a dummy task for a single rollout demonstration
dummy_task = RoomSelectionTask(
    attendee_count=7,
    time="10 AM",
    has_whiteboard=True,
    expected_choice="Large Conference Room with Whiteboard"
)
# Manually run the agent with the initial prompt for demonstration
# In a real setup, Trainer would orchestrate this.
print(f"Running agent with task: {dummy_task}")
resources = NamedResources(prompt_template=initial_prompt)  # what a Trainer would inject
# Call the agent directly for this demo, passing the prompt template by name.
reward = room_selector_agent(dummy_task, prompt_template=initial_prompt)
print(f"Agent received reward: {reward}")
# To initialize a trainer (requires more setup including an algorithm and dataset):
# trainer = Trainer(n_runners=1, algorithm=your_algorithm_instance)
# trainer.fit(agent=room_selector_agent, tasks=[dummy_task], resources={'prompt_template': initial_prompt})
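The `{task[attendee_count]}` placeholders in the quickstart template rely on plain `str.format` index lookups into the task dict. A framework-free sketch of that mechanic (plain `str.format`; `PromptTemplate`'s actual rendering engine may differ):

```python
# str.format resolves {task[key]} by indexing into the passed mapping;
# bare keys inside [] are treated as string keys, so no quotes are needed.
template = (
    "You are a room selection agent. Given the following task:\n"
    "Attendees: {task[attendee_count]}, Time: {task[time]}, "
    "Whiteboard needed: {task[has_whiteboard]}.\n"
    "Select the best room."
)

task = {
    "attendee_count": 7,
    "time": "10 AM",
    "has_whiteboard": True,
    "expected_choice": "Large Conference Room with Whiteboard",
}

prompt_text = template.format(task=task)
```

This is why the quickstart can pass the whole `task` dict to the template and reference individual fields from inside the template string.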