Instructor
Structured data extraction from LLMs via Pydantic models. Patches or wraps provider clients (OpenAI, Anthropic, Gemini, Cohere, Mistral, Groq, Ollama, and 15+ others) to add response_model, automatic validation, and retry logic. Uses tool-calling or JSON mode depending on provider. Core interface: client.chat.completions.create(response_model=MyModel, ...) returns a validated Pydantic instance. Maintained by Jason Liu / jxnl.
Warnings
- breaking instructor.patch() removed in 1.0.0. All pre-1.0 code using instructor.patch(openai.OpenAI()) breaks with AttributeError.
- breaking Pydantic v1 not supported. instructor requires Pydantic v2.
- breaking Provider-specific extras required for non-OpenAI backends. Using instructor.from_anthropic() without pip install 'instructor[anthropic]' raises ImportError.
- breaking from_provider() requires 'provider/model' string format. Bare model names raise ValueError.
- gotcha Each provider uses a different default Mode (tool-calling vs JSON). Anthropic uses ANTHROPIC_TOOLS by default; OpenAI uses TOOLS. Mixing providers without checking mode compatibility degrades output silently or raises errors.
- gotcha max_retries controls validation retry attempts, not HTTP retries. Default is 1 retry on Pydantic validation failure: the validation error is fed back to the model and the request is re-sent. Complex schemas with small models frequently exhaust all retries and raise InstructorRetryException.
- gotcha Streaming with create_partial() returns Partial[T] objects where fields are None until generated. Accessing fields before the stream completes returns None — not an error.
- gotcha openai is a required dependency even when using non-OpenAI providers (Anthropic, Gemini, etc.). This is by design — instructor proxies through OpenAI's interface structure.
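The max_retries mechanism above is driven by ordinary Pydantic validators: when one fails, instructor appends the error message to the conversation and re-asks. A minimal sketch of a validator that would trigger such a re-ask (pure Pydantic, no API call; the range check is illustrative):

```python
from pydantic import BaseModel, field_validator

class User(BaseModel):
    name: str
    age: int

    @field_validator("age")
    @classmethod
    def age_in_range(cls, v: int) -> int:
        # On failure, instructor feeds this error message back to the
        # LLM and retries, up to max_retries attempts.
        if not 0 <= v <= 130:
            raise ValueError(f"age {v} out of range 0-130")
        return v

user = User(name="John", age=25)  # passes validation
```

Write validator messages as instructions to the model ("age must be between 0 and 130"), since they become part of the retry prompt.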
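The create_partial() gotcha is easiest to see from the shape of the objects it yields mid-stream. A sketch in plain Pydantic (PartialUser here is a hypothetical stand-in for instructor's Partial[User], not its actual class):

```python
from typing import Optional
from pydantic import BaseModel

# Mid-stream, Partial[T] behaves like T with every field made Optional,
# defaulting to None until the model has emitted that field.
class PartialUser(BaseModel):
    name: Optional[str] = None
    age: Optional[int] = None

chunk = PartialUser(name="John")  # age not yet generated
print(chunk.age)  # None, no exception is raised
```

Guard downstream code with explicit None checks rather than assuming a field is populated once the first chunks arrive.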
Install
- pip install instructor
- pip install 'instructor[anthropic]'
- pip install 'instructor[google-genai]'
- pip install 'instructor[groq]'
- pip install 'instructor[litellm]'
Imports
- from_openai
import instructor; import openai; client = instructor.from_openai(openai.OpenAI())
- from_provider
import instructor; client = instructor.from_provider('openai/gpt-4o')
Quickstart
import instructor
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

# Unified provider interface (1.x recommended)
client = instructor.from_provider('openai/gpt-4o-mini')

user = client.chat.completions.create(
    response_model=User,
    messages=[{'role': 'user', 'content': 'John is 25 years old'}],
)
print(user)  # User(name='John', age=25)