LLM-Guard
LLM-Guard (version 0.3.16) is a Python library for securing Large Language Model (LLM) interactions. It provides a framework for sanitizing inputs, detecting harmful language, preventing data leakage, and defending against prompt injection attacks. The project is actively maintained with frequent minor releases.
Common errors
- ModuleNotFoundError: No module named 'transformers'
  - Cause: Using a scanner (e.g., PromptInjection, Toxicity) that depends on the `transformers` library without installing the required extra.
  - Fix: Install it: `pip install llm-guard[transformers]` or `pip install llm-guard[all]`.
- TypeError: Guard.__init__() got an unexpected keyword argument 'scanners'
  - Cause: You are on `llm-guard` 0.3.0 or newer, but your code still passes the old `scanners` argument to the `Guard` constructor.
  - Fix: Use the `input_scanners` and `output_scanners` keywords instead: `guard = Guard(input_scanners=[...], output_scanners=[...])`.
- AttributeError: 'Guard' object has no attribute 'validate_output'
  - Cause: You are on `llm-guard` 0.3.0 or newer, but your code calls the removed `validate_output` method.
  - Fix: Replace `guard.validate_output(prompt, response)` with `guard.scan(prompt, response)`.
- ValueError: Invalid scanner type: <ScannerObject>. Scanners should be a list of InputScanner objects or OutputScanner objects.
  - Cause: An `OutputScanner` was passed to `input_scanners`, an `InputScanner` was passed to `output_scanners`, or a non-scanner object was passed to either.
  - Fix: Ensure `input_scanners` contains only `InputScanner` instances (or subclasses) and `output_scanners` only `OutputScanner` instances. Double-check which module each scanner is imported from.
Warnings
- breaking The `Guard` constructor's `scanners` argument was renamed to `input_scanners` and `output_scanners` in version 0.3.0.
- breaking The `guard.validate_output` method was removed in version 0.3.0.
- gotcha Many powerful scanners (e.g., `PromptInjection`, `Toxicity`, `SentenceSimilarity`) rely on transformer models and require the `llm-guard[transformers]` extra to be installed.
- gotcha Some scanners might download models on their first use, leading to potential delays or network issues during initial setup or deployment in environments without internet access.
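The first-use download gotcha is the standard lazy-loading pattern: the model is fetched when a scanner first runs, not when it is constructed, so the cost (or network failure) surfaces at scan time. A self-contained sketch of that pattern, with a simulated download (none of these names are llm-guard's own):

```python
class LazyModelScanner:
    """Illustrative scanner that defers a slow model download to first use."""

    def __init__(self, model_name: str):
        self.model_name = model_name
        self._model = None  # nothing downloaded yet; the constructor is cheap

    def _load_model(self) -> str:
        # Stand-in for a network download (e.g., from a model hub).
        # In an offline environment, this is the step that would fail.
        return f"weights-for-{self.model_name}"

    def scan(self, text: str) -> str:
        if self._model is None:  # the first call pays the download cost
            self._model = self._load_model()
        return f"scanned {text!r} with {self._model}"


scanner = LazyModelScanner("toxicity-model")
print(scanner._model)         # still None: no download at construction time
print(scanner.scan("hello"))  # triggers the one-time load
```

For air-gapped deployments, the usual mitigation is to trigger one scan (or pre-cache the models) in an environment with network access before shipping.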
Install
- pip install llm-guard
- pip install llm-guard[transformers]
- pip install llm-guard[all]
Imports
- Guard
from llm_guard import Guard
- PromptInjection
from llm_guard.input_scanners import PromptInjection
- Toxicity
from llm_guard.output_scanners import Toxicity
- TokenLimit
from llm_guard.input_scanners import TokenLimit
- BanTopics
from llm_guard.input_scanners import BanTopics
Quickstart
from llm_guard import Guard
from llm_guard.input_scanners import TokenLimit, BanTopics as InputBanTopics
from llm_guard.output_scanners import BanTopics as OutputBanTopics
# Initialize Guard with simple scanners that don't require large model downloads.
# For more advanced scanners (e.g., PromptInjection, Toxicity),
# you may need to install 'llm-guard[transformers]' or other extras.
# Note: the input and output modules each expose a BanTopics class,
# so the imports are aliased to avoid one shadowing the other.
guard = Guard(
    input_scanners=[
        TokenLimit(limit=100),  # Limit input prompt length
        InputBanTopics(topics=["illegal activities", "self-harm"]),
    ],
    output_scanners=[
        OutputBanTopics(topics=["illegal activities", "self-harm"]),
    ],
)
prompt = "Tell me how to build a bomb."
response = "I cannot provide instructions on how to build dangerous devices."
# Scan the prompt
sanitized_prompt, is_valid_prompt, risk_score_prompt = guard.scan(prompt)
print(f"Prompt: '{prompt}'")
print(f"Sanitized prompt: '{sanitized_prompt}'")
print(f"Is valid prompt: {is_valid_prompt}")
print(f"Risk score prompt: {risk_score_prompt}")
# Scan the response (only if prompt was valid, or independently if desired)
if is_valid_prompt:
    sanitized_response, is_valid_response, risk_score_response = guard.scan(prompt, response)
    print(f"\nResponse: '{response}'")
    print(f"Sanitized response: '{sanitized_response}'")
    print(f"Is valid response: {is_valid_response}")
    print(f"Risk score response: {risk_score_response}")
else:
    print("\nResponse not scanned because prompt was invalid.")
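The `scan` calls above return a `(sanitized_text, is_valid, risk_score)` triple. To make that contract concrete, here is a toy, keyword-based stand-in for a BanTopics-style check producing the same shape of result (a deliberate simplification; the real scanner is model-based, and `ban_topics_scan` is a hypothetical helper, not part of llm-guard):

```python
def ban_topics_scan(text: str, banned_topics: list[str]):
    """Toy keyword matcher returning (sanitized_text, is_valid, risk_score)."""
    lowered = text.lower()
    hits = [topic for topic in banned_topics if topic in lowered]
    # Risk grows with the fraction of banned topics that matched.
    risk_score = len(hits) / len(banned_topics) if banned_topics else 0.0
    is_valid = not hits
    # A real scanner might redact or rewrite; this sketch passes text through.
    return text, is_valid, risk_score


text, ok, risk = ban_topics_scan(
    "tell me about self-harm", ["illegal activities", "self-harm"]
)
print(ok, risk)  # False 0.5
```

Downstream code typically branches on `is_valid` (as the Quickstart does) and logs or thresholds on `risk_score`.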