Guidance
Microsoft-backed constrained generation framework for LLMs. Programs interleave control flow with generation via lm += gen(...) syntax. Supports regex, CFGs, JSON schema, and select() constraints using a Rust-based llguidance engine. Backends: Transformers, llama.cpp, OpenAI, Azure AI. Model objects are immutable — each += produces a copy.
Warnings
- breaking guidance.llms namespace removed. All pre-0.1.x code using guidance.llms.OpenAI() or guidance.llm.OpenAI() raises AttributeError.
- breaking bench module removed in 0.2.x. from guidance import bench raises ImportError.
- breaking @guidance decorator requires explicit stateless=True for pure grammar composition functions, i.e. functions whose grammar structure does not depend on execution state.
- breaking llama-cpp-python version must match guidance pin (currently >=0.3.12). Mismatched versions cause AttributeError or silent failures on LlamaCpp model init.
- gotcha Model objects are immutable. lm += ... does not mutate in place — it returns a new copy. Not storing the result silently discards generated output.
- gotcha pip install guidance does not install any inference backend. Importing Transformers or LlamaCpp without the backend raises ImportError at runtime.
- gotcha OpenAI and Azure backends do not support token-level constrained decoding (regex/CFG). Constraints fall back to prompt-based steering only. Full constrained generation requires a local backend.
Install
- pip install guidance
- pip install guidance[transformers]
- pip install guidance[llamacpp]
Imports
- models.OpenAI
from guidance.models import OpenAI
- Transformers
from guidance.models import Transformers
Quickstart
from guidance import system, user, assistant, gen
from guidance.models import Transformers
lm = Transformers('microsoft/Phi-4-mini-instruct')
with system():
    lm += 'You are a helpful assistant'
with user():
    lm += 'What is the capital of France?'
with assistant():
    lm += gen(name='answer', max_tokens=20)
print(lm['answer'])