LM Format Enforcer
LM Format Enforcer is a Python library that constrains the output of large language models (LLMs) to formats such as JSON Schema or regular expressions. It integrates with popular LLM frameworks such as Hugging Face Transformers and vLLM. The current version is 0.11.3; minor updates are released frequently to support new integrations and fix compatibility issues.
Warnings
- breaking The return type of `TokenEnforcer.get_allowed_tokens()` changed in v0.11.1 to a torch-tensor bitmask to support the vLLM V1 integration. This is a breaking change if you call this internal function directly or rely on its return type.
- gotcha The user-facing API is a set of framework-specific integration helpers (e.g. `build_transformers_prefix_allowed_tokens_fn` for Transformers), rather than a generic `LMFormatEnforcer` class to instantiate directly. This is a common point of confusion for new users.
- gotcha The library has specific Python version requirements (`>=3.8, <4.0`). Using unsupported Python versions may lead to unexpected errors or installation issues.
- gotcha Compatibility with `transformers` and `pydantic` libraries can be strict. Older versions of `transformers` (pre-4.38.0) and `pydantic` (pre-2.0.0) might cause issues or not be fully supported.
- gotcha When integrating with vLLM, ensure your vLLM version is compatible with the `lm-format-enforcer` version. Specific vLLM versions (e.g., vLLM V1) may require particular `lm-format-enforcer` versions and features like the `use_bitmask` flag.
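To make the integration contract concrete, here is a toy, library-free sketch of the callback shape that Transformers' `generate()` expects for `prefix_allowed_tokens_fn`: a function of `(batch_id, input_ids)` that returns the list of token ids allowed at the next step. The tiny vocabulary and forcing logic below are invented purely for illustration; the real library derives the allowed set from a parser and the tokenizer.

```python
# Toy illustration of the prefix_allowed_tokens_fn contract used by
# transformers' generate(): given the batch index and the tokens decoded
# so far, return the token ids that may be sampled next.
VOCAB = {0: "{", 1: "}", 2: '"name"', 3: ":", 4: '"Alice"'}

def toy_prefix_allowed_tokens_fn(batch_id, input_ids):
    # Force the fixed sequence { "name" : "Alice" } one token at a time.
    forced = [0, 2, 3, 4, 1]
    step = len(input_ids)
    if step < len(forced):
        return [forced[step]]      # only one token is legal at this step
    return list(VOCAB)             # afterwards, anything is allowed

# Simulate greedy decoding under the constraint:
tokens = []
for _ in range(5):
    allowed = toy_prefix_allowed_tokens_fn(0, tokens)
    tokens.append(allowed[0])

print("".join(VOCAB[t] for t in tokens))  # {"name":"Alice"}
```

Because the model can only ever sample from the returned set, the output is guaranteed to match the format even when the model would otherwise drift.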
Install
- pip install lm-format-enforcer
- pip install lm-format-enforcer[vllm]
Imports
- JsonSchemaParser
from lmformatenforcer import JsonSchemaParser
- RegexParser
from lmformatenforcer import RegexParser
- build_transformers_prefix_allowed_tokens_fn
from lmformatenforcer.integrations.transformers import build_transformers_prefix_allowed_tokens_fn
- build_vllm_logits_processor
from lmformatenforcer.integrations.vllm import build_vllm_logits_processor
Quickstart
from transformers import AutoTokenizer, AutoModelForCausalLM
from lmformatenforcer import JsonSchemaParser
from lmformatenforcer.integrations.transformers import build_transformers_prefix_allowed_tokens_fn
import torch
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
# Define the JSON schema
json_schema = {
"type": "object",
"properties": {
"name": {"type": "string"},
"age": {"type": "integer", "minimum": 0},
"isStudent": {"type": "boolean"}
},
"required": ["name", "age", "isStudent"]
}
# Create the parser
json_parser = JsonSchemaParser(json_schema)
# Build the prefix_allowed_tokens_fn for transformers integration
prefix_allowed_tokens_fn = build_transformers_prefix_allowed_tokens_fn(tokenizer, json_parser)
prompt = "Please generate a JSON object describing a person with name, age, and student status:\n"
# Encode the prompt
input_ids = tokenizer.encode(prompt, return_tensors="pt")
# Generate text with format enforcement
# GPT2 might not perfectly follow instructions but the *format* will be enforced.
output = model.generate(
input_ids,
max_new_tokens=100,
prefix_allowed_tokens_fn=prefix_allowed_tokens_fn,
pad_token_id=tokenizer.eos_token_id,
do_sample=False, # For deterministic generation where possible
num_beams=1
)
# Decode and print the result
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
# Example output (format enforced):
# {"name": "Alice", "age": 25, "isStudent": true}
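Instead of writing the JSON schema by hand, it can be generated from a Pydantic model. This is a sketch assuming Pydantic v2 is installed; the `Person` model is invented for this example, mirroring the schema in the Quickstart:

```python
from pydantic import BaseModel, Field

class Person(BaseModel):
    name: str
    age: int = Field(ge=0)  # maps to "minimum": 0 in the generated schema
    isStudent: bool

# model_json_schema() (Pydantic v2) returns a dict usable as json_schema above.
json_schema = Person.model_json_schema()
print(sorted(json_schema["properties"]))  # ['age', 'isStudent', 'name']
```

The resulting dict can be passed directly to `JsonSchemaParser(json_schema)` in place of the hand-written schema.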