GEPA AI Optimization Framework

0.1.1 · active · verified Thu Apr 09

GEPA (Genetic-Pareto) is a Python framework for optimizing textual system components such as AI prompts, code snippets, and agent architectures. It employs LLM-based reflection and Pareto-efficient evolutionary search to improve performance against any evaluation metric. The library, currently at version 0.1.1, is under active development with frequent releases; recent work has focused on a universal API for optimizing any text parameter and on enhanced visualization tools.
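The "Pareto-efficient" part of the search means candidates are kept if no other candidate beats them on every task at once, so prompts with complementary strengths survive together. A minimal sketch of that selection step (illustrative only; the function name and score-vector representation here are assumptions, not GEPA's API):

```python
# Illustrative Pareto-front filter: keep candidates not dominated on every
# task by some other candidate. Candidate a dominates b if a scores >= b on
# all tasks and strictly better on at least one.

def pareto_front(candidates):
    """candidates: dict mapping candidate name -> list of per-task scores."""
    def dominates(a, b):
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

    return {
        name: scores
        for name, scores in candidates.items()
        if not any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
    }

scores = {
    "seed":    [0.4, 0.7],  # strong on task 2
    "mutant1": [0.6, 0.5],  # strong on task 1 -> neither dominates the other
    "mutant2": [0.3, 0.4],  # dominated by both -> pruned
}
print(sorted(pareto_front(scores)))  # ['mutant1', 'seed']
```

Because "seed" and "mutant1" each win on a different task, both stay in the pool; a single-best-score search would have discarded one of them.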

Warnings

Install

Imports

import gepa
import os

Quickstart

This quickstart optimizes a system prompt for math problems using GEPA's built-in AIME dataset. It demonstrates how to initialize a seed prompt and run the `gepa.optimize` function, which leverages LLM-based reflection to iteratively improve the prompt. Ensure relevant LLM API keys are set as environment variables (e.g., `OPENAI_API_KEY`) as GEPA uses LiteLLM.

import gepa
import os

# Set your LLM API key as an environment variable (e.g., in your shell or .env file)
# export OPENAI_API_KEY='your_openai_key_here'
# GEPA uses LiteLLM, supporting many providers. Adjust 'task_lm' and 'reflection_lm' accordingly.

# Load the AIME math dataset (built-in example)
trainset, valset, _ = gepa.examples.aime.init_dataset()

# Start with a basic prompt
seed_prompt = {
    "system_prompt": "You are a helpful assistant. Answer the question. "
    "Put your final answer in the format '### <answer>'"
}

# Optimize the prompt
# Ensure OPENAI_API_KEY or relevant API key is set in environment
result = gepa.optimize(
    seed_candidate=seed_prompt,
    trainset=trainset,
    valset=valset,
    task_lm=os.environ.get('GEPA_TASK_LM', 'openai/gpt-4o-mini'), # Model being optimized
    reflection_lm=os.environ.get('GEPA_REFLECTION_LM', 'openai/gpt-4o-mini'), # Model that generates improvements
    max_metric_calls=10 # Reduced for quick demo
)

print("\nOptimized prompt:", result.best_candidate['system_prompt'])
# Expected result shows improved accuracy (e.g., 46.6% -> 56.6% on AIME 2025 with GPT-4.1 Mini in full runs)
