{"id":1790,"library":"xgrammar","title":"xgrammar: Structured Generation","description":"xgrammar is a Python library focused on efficient, flexible, and portable structured generation, primarily for Large Language Models (LLMs). It allows defining output constraints using grammars (e.g., JSON schema) to ensure LLM outputs conform to specific formats. It is under active development, with frequent minor releases providing new features and performance improvements.","status":"active","version":"0.1.33","language":"en","source_language":"en","source_url":"https://github.com/mlc-ai/xgrammar","tags":["LLM","grammar","structured generation","json","python","mlc-ai"],"install":[{"cmd":"pip install xgrammar","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Required for numerical operations, especially token mask generation.","package":"numpy","optional":false}],"imports":[{"symbol":"GrammarCompiler","correct":"from xgrammar import GrammarCompiler"},{"symbol":"Grammar","correct":"from xgrammar import Grammar"},{"symbol":"GrammarMatcher","correct":"from xgrammar import GrammarMatcher"},{"symbol":"TokenizerInfo","correct":"from xgrammar import TokenizerInfo"}],"quickstart":{"code":"import json\nimport xgrammar as xgr\nfrom transformers import AutoConfig, AutoTokenizer\n\n# Load the tokenizer of the target LLM. The tokenizer must match the model\n# that will consume the token masks; swap in your own model id as needed.\nmodel_id = \"meta-llama/Llama-3.2-1B-Instruct\"\ntokenizer = AutoTokenizer.from_pretrained(model_id)\nconfig = AutoConfig.from_pretrained(model_id)\n\n# Build tokenizer info and a grammar compiler. vocab_size is passed\n# explicitly because some models pad the vocabulary beyond the tokenizer's.\ntokenizer_info = xgr.TokenizerInfo.from_huggingface(tokenizer, vocab_size=config.vocab_size)\ncompiler = xgr.GrammarCompiler(tokenizer_info)\n\n# Define a JSON schema and compile it into a grammar.\njson_schema = {\n    \"type\": \"object\",\n    \"properties\": {\n        \"name\": {\"type\": \"string\"},\n        \"age\": {\"type\": \"integer\", \"minimum\": 0},\n        \"isStudent\": {\"type\": \"boolean\"}\n    },\n    \"required\": [\"name\", \"age\"]\n}\ncompiled_grammar = compiler.compile_json_schema(json.dumps(json_schema))\n\n# A GrammarMatcher tracks generation state and produces per-step token masks.\nmatcher = xgr.GrammarMatcher(compiled_grammar)\ntoken_bitmask = xgr.allocate_token_bitmask(1, tokenizer_info.vocab_size)\n\n# Before each sampling step, fill the bitmask and apply it to the logits:\n#   xgr.apply_token_bitmask_inplace(logits, token_bitmask)\nmatcher.fill_next_token_bitmask(token_bitmask)\n\n# After sampling a token, advance the matcher; is_terminated() reports\n# when the structured output is complete.\n# matcher.accept_token(sampled_token_id)","lang":"python","description":"This quickstart defines a JSON schema, compiles it into a grammar with `GrammarCompiler`, and uses a `GrammarMatcher` to produce token bitmasks that constrain an LLM's sampler to schema-conforming output. It requires the HuggingFace `transformers` tokenizer of the target model; replace the example model id with your own. In a real generation loop, apply the bitmask to the logits with `apply_token_bitmask_inplace` before sampling and advance the matcher with `accept_token` after each step."},"warnings":[{"fix":"Upgrade to version 0.1.31 or newer: `pip install --upgrade xgrammar`.","message":"Version 0.1.30 was officially yanked due to critical issues with `apply_token_bitmask_inplace` and Windows OS compatibility for crossing-grammar caching. Users on 0.1.30 are strongly advised to upgrade.","severity":"breaking","affected_versions":"0.1.30"},{"fix":"Stay up-to-date with GitHub release notes and test thoroughly after upgrades. Pin specific minor versions if stability is critical.","message":"xgrammar is under rapid development (pre-1.0.0), with frequent minor releases and internal refactors. While efforts are made for backward compatibility, minor API changes or behavioral shifts can occur. Always review release notes when upgrading.","severity":"gotcha","affected_versions":"<1.0.0"},{"fix":"Ensure the `TokenizerInfo` passed to `GrammarCompiler` matches the tokenizer of your target LLM, including vocabulary size and special tokens (e.g., build it with `TokenizerInfo.from_huggingface`). Refer to the `mlc-ai/xgrammar` documentation for integration with specific inference engines.","message":"xgrammar requires an accurate description of the target LLM's tokenizer to operate correctly. Using a mismatched or incorrectly configured tokenizer will produce invalid token masks and malformed generation.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-09T00:00:00.000Z","next_check":"2026-07-08T00:00:00.000Z"}