{"id":6749,"library":"openai-guardrails","title":"OpenAI Guardrails","description":"OpenAI Guardrails is a Python framework for building safe and reliable AI systems by adding configurable safety and compliance guardrails to LLM applications. It provides a drop-in wrapper for OpenAI's Python client, enabling automatic input/output validation and moderation using a wide range of built-in guardrails covering content safety, data protection (e.g., PII detection), and content quality (e.g., hallucination detection). The library is actively maintained by OpenAI, with frequent releases, and is currently at version 0.2.1.","status":"active","version":"0.2.1","language":"en","source_language":"en","source_url":"https://github.com/openai/openai-guardrails-python","tags":["AI","LLM","safety","guardrails","OpenAI","moderation","PII"],"install":[{"cmd":"pip install openai-guardrails","lang":"bash","label":"Install core library"}],"dependencies":[{"reason":"Requires Python 3.11 or higher.","package":"python","optional":false},{"reason":"Many guardrail checks are model-based and incur standard OpenAI API costs, requiring an OpenAI API key.","package":"openai","optional":false}],"imports":[{"note":"For synchronous OpenAI client replacement.","symbol":"GuardrailsOpenAI","correct":"from guardrails import GuardrailsOpenAI"},{"note":"For asynchronous OpenAI client replacement.","symbol":"GuardrailsAsyncOpenAI","correct":"from guardrails import GuardrailsAsyncOpenAI"},{"note":"Exception raised when a guardrail detects a violation.","symbol":"GuardrailTripwireTriggered","correct":"from guardrails import GuardrailTripwireTriggered"}],"quickstart":{"code":"import os\nfrom pathlib import Path\n\nfrom guardrails import GuardrailsOpenAI, GuardrailTripwireTriggered\n\n# Ensure your OpenAI API key is set as an environment variable (OPENAI_API_KEY)\n# or passed directly to the client; model-based guardrails require a valid key.\n# To run this example, create a simple 'guardrails_config.json' file in the same directory:\n# {\"version\": \"1\", \"input\": {\"version\": \"1\", \"guardrails\": [{\"name\": \"Moderation\", \"config\": {}}]}}\n\ndef main():\n    # GuardrailsOpenAI is a drop-in replacement for the standard OpenAI client.\n    # It requires a config file (e.g., guardrails_config.json) that defines the guardrails to apply.\n    guardrails_client = GuardrailsOpenAI(config=Path(\"guardrails_config.json\"))\n\n    try:\n        # Use the Guardrails client just like a regular OpenAI client\n        response = guardrails_client.chat.completions.create(\n            model=\"gpt-3.5-turbo\",\n            messages=[\n                {\"role\": \"user\", \"content\": \"Hello, how are you?\"}\n            ]\n        )\n        print(\"LLM Output:\", response.choices[0].message.content)\n        # Guardrail results are attached to the response when available\n        if hasattr(response, 'guardrail_results'):\n            print(\"Guardrail Results:\", response.guardrail_results)\n\n        # Example of triggering a moderation guardrail (if configured to block)\n        print(\"\\nTesting with potentially problematic input...\")\n        problematic_response = guardrails_client.chat.completions.create(\n            model=\"gpt-3.5-turbo\",\n            messages=[\n                {\"role\": \"user\", \"content\": \"I want to harm someone.\"}\n            ]\n        )\n        print(\"LLM Output (problematic):\", problematic_response.choices[0].message.content)\n\n    except GuardrailTripwireTriggered as e:\n        print(f\"\\nGuardrail triggered: {e.guardrail_result.info}\")\n    except Exception as e:\n        print(f\"An unexpected error occurred: {e}\")\n\nif __name__ == \"__main__\":\n    # Set a placeholder API key if none is configured, for local testing without\n    # network calls (only possible if no model-based guardrails are configured).\n    if not os.environ.get(\"OPENAI_API_KEY\"):\n        os.environ[\"OPENAI_API_KEY\"] = os.environ.get('TEST_OPENAI_API_KEY', 'sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxx')\n    main()","lang":"python","description":"This quickstart demonstrates how to integrate `openai-guardrails` by replacing the standard OpenAI client with a `GuardrailsOpenAI` instance. It highlights the use of a `guardrails_config.json` file to define guardrail logic and shows how to handle the `GuardrailTripwireTriggered` exception raised when a violation occurs. A basic `guardrails_config.json` is provided as a comment for immediate testing."},"warnings":[{"fix":"Review and update code to directly access attributes of the underlying OpenAI response object returned by the Guardrails client methods (e.g., `response.choices[0].message.content`).","message":"In `v0.2.0`, the library changed to make the OpenAI response object directly accessible. This could affect how you access attributes (e.g., `response.output_text` or `response.choices[0].message.content`) if your code previously relied on wrapped access patterns.","severity":"breaking","affected_versions":">=0.2.0"},{"fix":"If PII masking was critical and relied on the removed Presidio integration, evaluate alternative PII detection/masking libraries or custom guardrails, or ensure your `guardrails_config.json` correctly handles PII without Presidio.","message":"In `v0.1.6`, the Presidio anonymizer dependency was removed due to conflicts. If your application relied on `openai-guardrails` for PII detection and masking via Presidio in versions prior to `v0.1.6`, this functionality may have changed or been removed, requiring alternative solutions or explicit dependency management.","severity":"breaking","affected_versions":">=0.1.6"},{"fix":"Always ensure a `guardrails_config.json` file is present and correctly configured according to your desired guardrail logic. The official Guardrails Wizard (guardrails.openai.com) is recommended for generating configurations.","message":"The core functionality of `openai-guardrails` relies on a `guardrails_config.json` file, which defines the specific guardrails (e.g., moderation, PII detection, jailbreak detection) and their configurations. This file is loaded at client initialization but is external to the Python code examples, requiring manual creation or use of the Guardrails Wizard.","severity":"gotcha","affected_versions":"All"},{"fix":"Monitor your OpenAI API usage and costs, especially when enabling model-based guardrails or running extensive evaluations. Consider optimizing guardrail configurations or using non-LLM checks where possible to manage costs.","message":"While the `openai-guardrails` library itself is open-source and free, many of its built-in guardrails (e.g., Hallucination Detection, Custom Prompt Check, Jailbreak) use OpenAI's own models and APIs, so these model-based checks incur standard OpenAI API usage costs.","severity":"gotcha","affected_versions":"All"},{"fix":"Carefully design your guardrail placement in multi-agent workflows. For checks on intermediate steps or specific tool interactions, implement tool-level guardrails or custom logic within the agent's flow rather than relying solely on agent-level input/output guardrails.","message":"When integrating with the OpenAI Agents SDK, agent-level guardrails have specific execution boundaries: input guardrails run only for the *first* agent in a multi-agent chain, and output guardrails run only for the agent that produces the *final* output. Intermediate agent interactions or specific tool calls may therefore require tool-level guardrails for comprehensive coverage.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-15T00:00:00.000Z","next_check":"2026-07-14T00:00:00.000Z","problems":[]}