Detoxify
Detoxify is a Python library for detecting toxic comments with pre-trained transformer models. It provides a simple interface that scores text against labels such as toxicity, severe_toxicity, obscene, threat, insult, and identity attack. The latest release is 0.5.2. It requires Python >= 3.7 and is built on PyTorch and Hugging Face Transformers.
pip install detoxify

Common errors
error ModuleNotFoundError: No module named 'detoxify'
cause The package is not installed or the module name is misspelled.
fix Run 'pip install detoxify' in your environment.
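A minimal sanity-check sketch for this error: try the import and fail with a clear message if the package is missing.

try:
    from detoxify import Detoxify
except ModuleNotFoundError:
    raise SystemExit("detoxify is not installed; run: pip install detoxify")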
error OSError: Can't load the model 'original' from HuggingFace. File not found or no internet.
cause The model weights are not cached locally and there is no internet connection to download them on first use.
fix Ensure internet access on the first run, or pre-cache the weights by instantiating Detoxify('original') once while connected; torch.hub keeps the downloaded checkpoint for offline reuse.
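A minimal pre-caching sketch, assuming the default torch.hub cache location: construct the model once while online so the checkpoint is stored locally.

from detoxify import Detoxify

# Run once while connected: construction downloads the checkpoint via torch.hub
# and caches it (by default under ~/.cache/torch) for later offline runs.
Detoxify('original')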
error AttributeError: module 'detoxify' has no attribute 'Detoxify'
cause Incorrect import pattern: used 'import detoxify' instead of 'from detoxify import Detoxify'.
fix Use 'from detoxify import Detoxify' to import the class.
Warnings
gotcha The default 'original' model may not be the best fit for every use case: 'unbiased' reduces unintended bias around identity mentions, and 'multilingual' covers several non-English languages. See https://github.com/unitaryai/detoxify#available-models.
fix Specify a model name explicitly: Detoxify('unbiased') or Detoxify('multilingual').
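All model variants load the same way; a short sketch using the model names from the README (the exact output labels differ slightly per model):

from detoxify import Detoxify

# 'unbiased' targets reduced identity bias; 'multilingual' covers several languages
print(Detoxify('unbiased').predict('example text'))
print(Detoxify('multilingual').predict('ejemplo de texto'))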
gotcha The output scores are per-label sigmoid outputs in the range [0, 1], not calibrated probabilities, and the library does not apply any classification threshold for you.
fix Choose and apply your own cutoff, e.g. treat result['toxicity'] >= 0.5 as toxic, and validate the threshold on your own data.
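A sketch of applying your own cutoff; the 0.5 here is an arbitrary, uncalibrated choice:

from detoxify import Detoxify

result = Detoxify('original').predict('some borderline comment')
# Scores are sigmoid outputs; tune the threshold for your precision/recall needs.
is_toxic = result['toxicity'] >= 0.5
print(is_toxic)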
gotcha The first run downloads the model weights (~500 MB) via torch.hub, which may take time and requires an internet connection.
fix Pre-cache the weights by instantiating the model once while online, or set the TORCH_HOME environment variable to control where torch.hub stores the download.
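To control where the weights land, a sketch using torch's standard TORCH_HOME cache variable (the path below is hypothetical):

import os
os.environ['TORCH_HOME'] = '/path/to/model_cache'  # hypothetical cache directory

from detoxify import Detoxify
Detoxify('original')  # downloads into the cache on first run, then reuses it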
deprecated The 'original' model reproduces the 2018 Jigsaw Toxic Comment challenge setup and is kept mainly for comparison; it is not recommended for new projects.
fix Switch to Detoxify('unbiased') or Detoxify('multilingual').
Install
pip install detoxify[gpu]
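With a CUDA-capable PyTorch install, the model can be placed on the GPU via the device argument (documented in the project README); a sketch with a CPU fallback:

import torch
from detoxify import Detoxify

# device accepts any torch.device input and defaults to CPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = Detoxify('original', device=device)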
Imports
- Detoxify
  wrong:   import Detoxify
  correct: from detoxify import Detoxify
Quickstart
from detoxify import Detoxify
# Load the model ('original' is also the default if no name is given)
model = Detoxify('original')
# predict returns a dict mapping each label to a score between 0 and 1
result = model.predict('This is a terrible, horrible example!')
print(result)
# Example output: {'toxicity': 0.9, 'severe_toxicity': 0.1, ...}
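predict also accepts a list of strings and scores them in one batch; a sketch tabulating the results with pandas (pandas is not a detoxify dependency, install it separately):

import pandas as pd
from detoxify import Detoxify

texts = ['first example comment', 'second example comment']
results = Detoxify('original').predict(texts)  # dict of label -> list of scores
print(pd.DataFrame(results, index=texts).round(5))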