LlamaIndex OpenAI Pydantic Program

0.3.2 · active · verified Sun Apr 12

The `llama-index-program-openai` library is a LlamaIndex integration for generating structured data with OpenAI's API and Pydantic. Users define explicit output schemas as Pydantic models, enabling robust, type-safe extraction of data from LLM responses. Currently at version 0.3.2, the package follows the rapid, modular release cadence of the broader LlamaIndex ecosystem.

Install

pip install llama-index-program-openai

Imports

from llama_index.program.openai import OpenAIPydanticProgram

Quickstart

This quickstart demonstrates how to use `OpenAIPydanticProgram` to extract structured data. It defines a Pydantic `Album` schema, constructs a prompt using a template, and then executes the program with an OpenAI LLM to populate the `Album` object based on the given inspiration. Ensure your `OPENAI_API_KEY` environment variable is set.

import os
from pydantic import BaseModel
from typing import List
from llama_index.program.openai import OpenAIPydanticProgram

# Set your OpenAI API key (replace 'sk-...' or ensure it's in your environment)
os.environ["OPENAI_API_KEY"] = os.environ.get("OPENAI_API_KEY", "sk-YOUR_OPENAI_KEY_HERE")

# Define your desired structured output schema using Pydantic
class Song(BaseModel):
    title: str
    length_seconds: int

class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]

# Define the prompt template for the LLM
prompt_template_str = """
Generate an example album, with an artist and a list of songs.
Using the movie {movie_name} as inspiration.
"""

# Initialize the OpenAI Pydantic program
program = OpenAIPydanticProgram.from_defaults(
    output_cls=Album,
    prompt_template_str=prompt_template_str,
    verbose=True
)

# Run the program to get structured output
output = program(movie_name="The Shining")

# Print the structured output
print(output.model_dump_json(indent=2))  # Pydantic v2; .json() is deprecated
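Because the program returns a plain `Album` instance, its fields can be read with normal attribute access and are validated against the schema. The sketch below illustrates this with hand-built album data; the names and values (`"Overlook Sessions"`, `"The Caretakers"`, etc.) are invented for illustration, standing in for whatever the LLM call would actually return:

```python
from typing import List
from pydantic import BaseModel, ValidationError

class Song(BaseModel):
    title: str
    length_seconds: int

class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]

# Hand-built instance standing in for what program(movie_name=...) returns
album = Album(
    name="Overlook Sessions",
    artist="The Caretakers",
    songs=[Song(title="Room 237", length_seconds=214)],
)

# Typed attribute access -- no dict keys, no manual JSON parsing
print(album.songs[0].title)            # -> Room 237
print(album.model_dump_json(indent=2))

# Pydantic rejects data that cannot be coerced to the declared types,
# which is what makes the extraction type-safe end to end
try:
    Song(title="Redrum", length_seconds="not a number")
except ValidationError:
    print("validation failed")
```

This validation step is the point of supplying `output_cls`: a malformed LLM response surfaces as a `ValidationError` rather than silently producing wrongly-typed data.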
