LlamaIndex OpenAI Pydantic Program
The `llama-index-program-openai` library is an integration for LlamaIndex that facilitates the generation of structured data using OpenAI's API in conjunction with Pydantic objects. It allows users to define explicit output schemas using Pydantic models, enabling robust and type-safe data extraction from LLM responses. Currently at version 0.3.2, this package follows the rapid release and modular development cadence of the broader LlamaIndex ecosystem.
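To make the idea of an explicit output schema concrete, the sketch below prints the JSON schema that Pydantic derives from a model. It assumes Pydantic V2 (`model_json_schema`) and makes no OpenAI call. The `Song` and `Album` models mirror those in the Quickstart, and how the integration forwards such a schema to the API is not shown here.

```python
import json
from typing import List

from pydantic import BaseModel


class Song(BaseModel):
    title: str
    length_seconds: int


class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]


# Pydantic V2 derives a JSON schema from the type annotations alone.
schema = Album.model_json_schema()
print(json.dumps(schema, indent=2))
```

The nested `Song` model appears under `$defs` and is referenced from the `songs` array, so a single schema describes the whole nested structure.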
Warnings
- breaking Beginning with LlamaIndex v0.10, the library adopted a modular package structure. `llama-index-program-openai` is now a standalone package, and core components are in `llama-index-core`. Ensure you install specific integration packages (e.g., `llama-index-llms-openai`) and adjust imports if migrating from older versions where all modules were consolidated under `llama_index`. The `ServiceContext` object was deprecated in v0.10 and fully removed in v0.11; use the `Settings` object or direct parameter passing instead.
- breaking LlamaIndex v0.11 fully migrated to Pydantic V2. If your project previously used `pydantic.v1` imports to work around compatibility issues, these should now be removed or updated to Pydantic V2 syntax.
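- gotcha An `OPENAI_API_KEY` environment variable must be set for `llama-index-program-openai` to function, as it relies on OpenAI's API. With the `openai` 1.x client that the modular LlamaIndex packages depend on, a missing or invalid key typically surfaces as an `openai.AuthenticationError` (the older `openai.error` module no longer exists in 1.x).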
- gotcha When using `OpenAIPydanticProgram` within agents that leverage OpenAI's function calling, tool descriptions have a maximum length of 1024 characters. Exceeding this limit will cause an error during tool construction.
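The two gotchas above can be caught before any request is made. The following is a minimal sketch using only the standard library. The helper names `missing_api_key` and `fit_tool_description` are hypothetical and not part of LlamaIndex, and truncating a description is just one possible way to stay under the 1024-character limit.

```python
import os
from typing import Mapping, Optional

# Limit quoted above for OpenAI function-calling tool descriptions.
MAX_TOOL_DESCRIPTION_CHARS = 1024


def missing_api_key(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when OPENAI_API_KEY is absent or blank in the mapping."""
    env = os.environ if env is None else env
    return not env.get("OPENAI_API_KEY", "").strip()


def fit_tool_description(
    description: str, limit: int = MAX_TOOL_DESCRIPTION_CHARS
) -> str:
    """Truncate a description to the limit, marking the cut with '...'."""
    if len(description) <= limit:
        return description
    return description[: limit - 3].rstrip() + "..."


if missing_api_key():
    print("OPENAI_API_KEY is not set; export it before running a program.")

# An over-long description is shortened so that it fits within the limit.
short = fit_tool_description("x" * 2000)
print(len(short))
```

Failing early with a clear message is usually easier to debug than an authentication error raised deep inside a request.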
Install
pip install llama-index-program-openai
Imports
- OpenAIPydanticProgram
from llama_index.program.openai import OpenAIPydanticProgram
- BaseModel
from pydantic import BaseModel
- List
from typing import List
Quickstart
import os
from pydantic import BaseModel
from typing import List
from llama_index.program.openai import OpenAIPydanticProgram
# Set your OpenAI API key (replace 'sk-...' or ensure it's in your environment)
os.environ["OPENAI_API_KEY"] = os.environ.get("OPENAI_API_KEY", "sk-YOUR_OPENAI_KEY_HERE")
# Define your desired structured output schema using Pydantic
class Song(BaseModel):
    title: str
    length_seconds: int

class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]
# Define the prompt template for the LLM
prompt_template_str = """
Generate an example album, with an artist and a list of songs.
Using the movie {movie_name} as inspiration.
"""
# Initialize the OpenAI Pydantic program
program = OpenAIPydanticProgram.from_defaults(
    output_cls=Album,
    prompt_template_str=prompt_template_str,
    verbose=True,
)
# Run the program to get structured output
output = program(movie_name="The Shining")
# Print the structured output
# Pydantic V2 replaces the deprecated .json() with .model_dump_json()
print(output.model_dump_json(indent=2))
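Because the program's result is a Pydantic object, it can be handled like any other model instance. The sketch below avoids the network by validating a hand-written dictionary in place of a model response and shows what happens when a field has the wrong type. The `raw` and `bad` payloads are invented for illustration, and Pydantic V2 is assumed.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Song(BaseModel):
    title: str
    length_seconds: int


class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]


# Hypothetical payload standing in for what the LLM might return.
raw = {
    "name": "Overlook Nights",
    "artist": "The Caretakers",
    "songs": [
        {"title": "Room 237", "length_seconds": 237},
        {"title": "Hedge Maze", "length_seconds": 198},
    ],
}

album = Album.model_validate(raw)
total_seconds = sum(song.length_seconds for song in album.songs)
print(album.model_dump_json(indent=2))
print(f"Total length: {total_seconds} s")

# A malformed payload fails validation instead of passing through silently.
bad = {
    "name": "Broken",
    "artist": "Nobody",
    "songs": [{"title": "Oops", "length_seconds": "long"}],
}
try:
    Album.model_validate(bad)
    validation_failed = False
except ValidationError as exc:
    validation_failed = True
    print(f"Rejected payload with {exc.error_count()} error(s)")
```

Typed attribute access (`album.songs`, `song.length_seconds`) is the main practical benefit over parsing free-form text from the model.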