nh3 HTML Sanitizer

0.3.4 verified Tue May 12 auth: no python install: verified quickstart: verified

nh3 is a Python binding to the Rust crate Ammonia, providing fast, configurable, whitelist-based HTML sanitization. It is notably faster than pure-Python alternatives such as `Bleach`, and has largely replaced it since `Bleach` was deprecated because its `html5lib` dependency is no longer actively maintained. The library is actively maintained (current version 0.3.4) and receives regular updates to both the Python bindings and the underlying Rust components.

pip install nh3
error ModuleNotFoundError: No module named 'nh3'
cause The 'nh3' library is not installed in your current Python environment, or there is a typo in the import statement.
fix
Install the library using pip: `pip install nh3`
error TypeError: 'module' object is not callable
cause You are attempting to call the 'nh3' module directly as a function (e.g., `nh3(html_string)`), instead of calling its `clean` function.
fix
Use the `clean` function within the module to sanitize HTML: `nh3.clean(html_string)`
error TypeError: Argument 'tags' must be set, not list
cause A parameter in `nh3.clean()`, such as `tags`, `clean_content_tags`, or `url_schemes`, was provided with an incorrect data type (e.g., a list instead of a set).
fix
Ensure arguments like `tags`, `clean_content_tags`, and `url_schemes` are passed as Python `set` objects, and `attributes` as a dictionary mapping tag names to `set` objects, as specified in the documentation. For example: `nh3.clean(html_string, tags={'b', 'i'})`
breaking nh3 is a modern replacement for the deprecated `Bleach` library. The APIs are similar but not identical: for example, nh3 expects `set` objects where `Bleach` accepted lists, and it removes disallowed tags where `Bleach` escapes them by default. Users migrating from `Bleach` should review their sanitization rules carefully.
fix Review `nh3` documentation, especially for `tags`, `attributes`, and `url_schemes` parameters, to match your desired sanitization policy. Consider `nh3.Cleaner` for explicit, reusable configurations.
gotcha By default, `nh3.clean()` allows a relatively broad set of HTML tags (around 75) and attributes. This may be more permissive than an application needs and can let through markup you did not intend to accept unless you restrict it explicitly.
fix Always explicitly specify the `tags` and `attributes` you want to allow using the `nh3.clean()` function parameters or by configuring an `nh3.Cleaner` instance. For example, `nh3.clean(html, tags={'a', 'p', 'b', 'i'}, attributes={'a': {'href'}})` defines a strict whitelist. Note that an entry in `attributes` only has an effect if its tag is also listed in `tags`.
breaking As `nh3` is a binding to the Rust `ammonia` crate, security vulnerabilities in the underlying Rust library can affect it. Regular updates are necessary to incorporate critical fixes. For example, recent updates addressed `RUSTSEC-2025-0071`.
fix Keep `nh3` up-to-date with the latest releases. Monitor the `nh3` GitHub repository and PyPI for security announcements and new versions.
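A deployment or CI step can check the installed version against a minimum. This is a rough sketch using only the standard library. The floor of `0.3.4` is an assumption taken from the version verified on this page, not a statement about which release fixed a given advisory, and the helper names are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: 0.3.4 is used as the floor only because it is the version verified on this page
MINIMUM_NH3 = "0.3.4"


def parse_release(text):
    """Turn '0.3.4' into (0, 3, 4); assumes a plain numeric X.Y.Z release string."""
    return tuple(int(part) for part in text.split(".")[:3])


def nh3_is_current(minimum=MINIMUM_NH3):
    """Return True if the installed nh3 is at least `minimum`, False if older or missing."""
    try:
        installed = version("nh3")
    except PackageNotFoundError:
        return False
    return parse_release(installed) >= parse_release(minimum)


print("nh3 up to date:", nh3_is_current())
```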
gotcha When integrating with web frameworks like Django, simply calling `nh3.clean()` in templates or view logic may not be sufficient. Unsanitized data might still be saved to the database if sanitization isn't applied at the input (e.g., form field or model field) level.
fix Implement sanitization in Django form fields (e.g., in `to_python`), or consider a dedicated package such as `django-nh3`, after evaluating its maturity, that provides a sanitized model field. Always sanitize user-provided input *before* saving it to the database.
python  os / libc      wheel  install  import  disk
3.10    alpine (musl)  wheel  -        0.00s   20.6M
3.10    alpine (musl)  -      -        0.00s   20.6M
3.10    slim (glibc)   wheel  1.6s     0.00s   21M
3.10    slim (glibc)   -      -        0.00s   21M
3.11    alpine (musl)  wheel  -        0.00s   22.4M
3.11    alpine (musl)  -      -        0.00s   22.4M
3.11    slim (glibc)   wheel  1.6s     0.00s   22M
3.11    slim (glibc)   -      -        0.00s   22M
3.12    alpine (musl)  wheel  -        0.00s   14.3M
3.12    alpine (musl)  -      -        0.00s   14.3M
3.12    slim (glibc)   wheel  1.5s     0.00s   14M
3.12    slim (glibc)   -      -        0.00s   14M
3.13    alpine (musl)  wheel  -        0.00s   14.0M
3.13    alpine (musl)  -      -        0.00s   13.9M
3.13    slim (glibc)   wheel  1.6s     0.00s   14M
3.13    slim (glibc)   -      -        0.00s   14M
3.9     alpine (musl)  wheel  -        0.00s   20.1M
3.9     alpine (musl)  -      -        0.00s   20.1M
3.9     slim (glibc)   wheel  1.9s     0.00s   20M
3.9     slim (glibc)   -      -        0.00s   20M

Sanitize an HTML fragment using the default configuration, then customize allowed tags, and finally demonstrate creating a reusable `Cleaner` instance with specific rules for tags, attributes, and URL schemes.

import nh3

# Basic sanitization
unclean_html = "<b><img src='' onerror='alert(\'hax\')'>XSS?</b><script>alert('malicious')</script><p>Hello</p>"
cleaned_html = nh3.clean(unclean_html)
print(f"Cleaned HTML: {cleaned_html}")

# Customizing allowed tags
custom_cleaned = nh3.clean(unclean_html, tags={"b", "p"})
print(f"Custom cleaned (b, p only): {custom_cleaned}")

# Using Cleaner for reusable configuration
my_cleaner = nh3.Cleaner(
    # "a" and "img" must be allowed as tags for their attribute rules below to apply
    tags={"a", "b", "i", "img", "p"},
    attributes={
        "a": {"href"},
        "img": {"src"}
    },
    url_schemes={"http", "https"}
)
reusable_cleaned = my_cleaner.clean("<a href='javascript:alert(\"xss\")'>Click</a><b>Bold</b><i>Italic</i><script>evil</script>")
print(f"Reusable cleaner output: {reusable_cleaned}")