{"id":4616,"library":"llama-index-question-gen-openai","title":"LlamaIndex OpenAI Question Generator","description":"The `llama-index-question-gen-openai` package provides an integration for LlamaIndex that generates sub-questions using OpenAI's function calling API. It leverages the function-calling support of OpenAI chat models to return structured JSON, which reduces output-parsing failures compared to the generic `LLMQuestionGenerator`. The current version is 0.3.1, and the package is maintained as part of the actively developed LlamaIndex ecosystem.","status":"active","version":"0.3.1","language":"en","source_language":"en","source_url":"https://github.com/run-llama/llama_index/tree/main/llama-index-integrations/question_gen/llama-index-question-gen-openai","tags":["llama-index","openai","question-generation","LLM","RAG","function-calling"],"install":[{"cmd":"pip install llama-index-question-gen-openai","lang":"bash","label":"Install package"}],"dependencies":[{"reason":"This package is an integration within the LlamaIndex ecosystem and depends on core LlamaIndex abstractions.","package":"llama-index-core","optional":false},{"reason":"Provides the underlying API client for interacting with OpenAI models.","package":"openai","optional":false},{"reason":"Python version compatibility as specified in PyPI metadata.","package":"python","optional":false,"version_spec":">=3.9, <4.0"}],"imports":[{"note":"Following the LlamaIndex v0.10+ modularization, this integration ships as a separate package that installs into the `llama_index.question_gen.openai` namespace; the legacy `llama_index.question_gen.base` path no longer provides the class.","wrong":"from llama_index.question_gen.base import OpenAIQuestionGenerator","symbol":"OpenAIQuestionGenerator","correct":"from llama_index.question_gen.openai import OpenAIQuestionGenerator"}],"quickstart":{"code":"import os\nfrom llama_index.question_gen.openai import OpenAIQuestionGenerator\nfrom llama_index.core.tools import ToolMetadata, QueryEngineTool\nfrom llama_index.core import QueryBundle, 
VectorStoreIndex, SimpleDirectoryReader\nfrom llama_index.llms.openai import OpenAI\n\n# Set your OpenAI API key (replace with your actual key or load from .env)\nos.environ[\"OPENAI_API_KEY\"] = os.environ.get(\"OPENAI_API_KEY\", \"YOUR_OPENAI_API_KEY\")\n\n# Create a dummy data file for demonstration\nwith open(\"data.txt\", \"w\") as f:\n    f.write(\"The capital of France is Paris. Paris is known for its Eiffel Tower. \")\n    f.write(\"The capital of Germany is Berlin. Berlin has a rich history.\")\n\n# Load the data file and create a simple index for a tool\nreader = SimpleDirectoryReader(input_files=[\"data.txt\"])\ndocuments = reader.load_data()\nindex = VectorStoreIndex.from_documents(documents)\nquery_engine = index.as_query_engine()\n\n# Define a tool for the question generator to use\ntool = QueryEngineTool(\n    query_engine=query_engine,\n    metadata=ToolMetadata(\n        name=\"city_info\",\n        description=\"Provides information about cities and their landmarks.\"\n    ),\n)\n\n# Initialize OpenAIQuestionGenerator with any OpenAI chat model\n# that supports function/tool calling.\nquestion_gen = OpenAIQuestionGenerator.from_defaults(llm=OpenAI(model=\"gpt-4o-mini\"))\n\n# Generate sub-questions based on a complex query and available tools.\n# Note: generate() expects ToolMetadata objects, not the tools themselves.\nquery_bundle = QueryBundle(\"Tell me about the capitals of European countries and their famous landmarks.\")\nsub_questions = question_gen.generate(\n    tools=[tool.metadata],\n    query=query_bundle\n)\n\nprint(f\"Generated {len(sub_questions)} sub-questions:\")\nfor sq in sub_questions:\n    print(f\"- Question: {sq.sub_question}, Tool: {sq.tool_name}\")\n","lang":"python","description":"This quickstart demonstrates how to initialize `OpenAIQuestionGenerator` and use it within a LlamaIndex application to generate sub-questions. It requires an `OPENAI_API_KEY` and sets up a basic `QueryEngineTool` whose `ToolMetadata` the generator references. 
The generated sub-questions are tailored to the provided tools and the original query."},"warnings":[{"fix":"Use an OpenAI model that supports function/tool calling (e.g., `gpt-4o-mini`, `gpt-4o`, or `gpt-3.5-turbo`) when initializing `OpenAIQuestionGenerator`; note that dated snapshots such as `gpt-3.5-turbo-0613` have since been retired by OpenAI.","message":"The `OpenAIQuestionGenerator` is specifically designed for OpenAI chat models that support the function calling API (e.g., the `gpt-3.5-turbo` and `gpt-4` families). It will not work with older completion-based OpenAI models or other generic LLMs that lack this API; those are typically handled by `LLMQuestionGenerator`.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Always `pip install` the specific `llama-index-*` integration package you intend to use (e.g., `pip install llama-index-question-gen-openai`).","message":"As of LlamaIndex v0.10.x, the library adopted a modular, namespaced package structure. This means integration packages like `llama-index-question-gen-openai` must be installed explicitly. Importing components directly from `llama_index` or `llama_index.core` for integrations that have moved to separate packages will raise `ImportError` if the specific integration package is not installed.","severity":"breaking","affected_versions":">=0.10.0"},{"fix":"Before running your application, ensure `OPENAI_API_KEY` is properly configured in your environment. For quick testing, you can directly set `os.environ[\"OPENAI_API_KEY\"]`.","message":"An `OPENAI_API_KEY` must be set as an environment variable (e.g., `export OPENAI_API_KEY='sk-...'` or via a `.env` file) for all OpenAI integrations, including `OpenAIQuestionGenerator`. Failure to do so will result in authentication errors when making API calls.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Shorten tool descriptions or move extensive details into the prompt itself rather than relying solely on tool metadata. 
Evaluate whether fewer, more broadly defined tools can achieve the same goal without hitting the limit.","message":"When using `OpenAIQuestionGenerator` within a `SubQuestionQueryEngine` with multiple tools, the combined descriptions of these tools can exceed OpenAI's function calling API limit on description length (currently 1024 characters). This will raise a `ValueError`.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Implement robust error handling with `try`/`except` blocks around API calls. Consider using a retry mechanism with exponential backoff (e.g., via the `tenacity` library) to manage transient network issues and rate limits. Verify your API key and monitor usage in your OpenAI dashboard.","message":"Network issues, incorrect API keys, or exceeded rate limits can lead to `APIConnectionError` or `RateLimitError` when calling the OpenAI API. These are common with external API interactions and can disrupt workflows.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}