GEPA AI Optimization Framework
GEPA (Genetic-Pareto) is a Python framework for optimizing textual system components such as AI prompts, code snippets, and agent architectures. It employs LLM-based reflection and Pareto-efficient evolutionary search to improve performance against any evaluation metric. The library, currently at version 0.1.1, is under active development with frequent releases; recent work has focused on a universal API for optimizing any text parameter and on improved visualization tools.
Warnings
- breaking Version 0.1.0 introduced `optimize_anything` as a new, universal API for optimizing any text-representable artifact, including code and agent architectures. While `gepa.optimize` remains a valid entry point for prompt optimization, users migrating from pre-0.1.0 versions or seeking more general text optimization should consult the `optimize_anything` documentation.
- gotcha GEPA requires Python 3.10 or later, but strictly less than 3.15. Using an incompatible Python version will lead to installation or runtime errors.
- gotcha GEPA relies on LiteLLM for LLM interactions. You must set appropriate API keys for your chosen LLM provider (e.g., `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`) as environment variables before running optimizations.
- gotcha Optimization runs, especially with large `max_metric_calls` or slower LLMs, can be time and resource-intensive. For initial experiments, reduce `max_metric_calls` (e.g., to 50), use faster task models, or reduce the training set size (e.g., to 20-30 examples).
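The gotchas above can be handled up front in plain Python before calling the optimizer. This is a minimal sketch: the key value is a placeholder, and `max_metric_calls`/`train_subset_size` are just local variables illustrating the suggested budget, not GEPA API.

```python
import os

# Placeholder credential: substitute your real provider key.
# LiteLLM reads provider keys from the environment, e.g. OPENAI_API_KEY
# for OpenAI models or ANTHROPIC_API_KEY for Anthropic models.
os.environ.setdefault("OPENAI_API_KEY", "sk-your-key-here")

# Cheap first-run settings, per the gotcha above: a small rollout budget
# and a trimmed training set.
max_metric_calls = 50
train_subset_size = 30  # e.g. slice with trainset[:train_subset_size]
```

Setting the key with `setdefault` keeps a key already exported in your shell or `.env` file intact.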
Install
- `pip install gepa`
- `pip install gepa[full]`
Imports
- optimize: `import gepa`, then call `gepa.optimize(...)`
- optimize_anything: `from gepa.optimize_anything import optimize_anything`
Quickstart
import gepa
import os
# Set your LLM API key as an environment variable (e.g., in your shell or .env file)
# export OPENAI_API_KEY='your_openai_key_here'
# GEPA uses LiteLLM, supporting many providers. Adjust 'task_lm' and 'reflection_lm' accordingly.
# Load the AIME math dataset (built-in example)
trainset, valset, _ = gepa.examples.aime.init_dataset()
# Start with a basic prompt
seed_prompt = {
    "system_prompt": "You are a helpful assistant. Answer the question. "
                     "Put your final answer in the format '### <answer>'"
}
# Optimize the prompt
# Ensure OPENAI_API_KEY or relevant API key is set in environment
result = gepa.optimize(
    seed_candidate=seed_prompt,
    trainset=trainset,
    valset=valset,
    task_lm=os.environ.get('GEPA_TASK_LM', 'openai/gpt-4o-mini'),  # model being optimized
    reflection_lm=os.environ.get('GEPA_REFLECTION_LM', 'openai/gpt-4o-mini'),  # model that generates improvements
    max_metric_calls=10,  # reduced for a quick demo
)
print("\nOptimized prompt:", result.best_candidate['system_prompt'])
# Expected result shows improved accuracy (e.g., 46.6% -> 56.6% on AIME 2025 with GPT-4.1 Mini in full runs)