nh3 HTML Sanitizer

0.3.4 · active · verified Sat Mar 28

nh3 is a Python binding to Ammonia, an HTML sanitizer written in Rust. It provides fast, configurable, whitelist-based HTML sanitization: only explicitly allowed tags, attributes, and URL schemes survive cleaning. It is notably faster than pure-Python alternatives such as `Bleach`, which it has largely replaced since Bleach's `html5lib` dependency became unmaintained. The library is actively maintained; the current version is 0.3.4, and both the Python bindings and the underlying Rust components receive regular updates.

Warnings

Install

pip install nh3

Imports

import nh3

Quickstart

The example below sanitizes an HTML fragment with the default configuration, then restricts the allowed tags, and finally creates a reusable `Cleaner` instance with specific rules for tags, attributes, and URL schemes.

import nh3

# Basic sanitization
unclean_html = "<b><img src='' onerror='alert(\'hax\')'>XSS?</b><script>alert('malicious')</script><p>Hello</p>"
cleaned_html = nh3.clean(unclean_html)
print(f"Cleaned HTML: {cleaned_html}")

# Customizing allowed tags
custom_cleaned = nh3.clean(unclean_html, tags={"b", "p"})
print(f"Custom cleaned (b, p only): {custom_cleaned}")

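# Using Cleaner for reusable configuration.
# "a" and "img" are included in `tags`: attribute rules for a tag that is
# not itself allowed have no effect, because the tag is stripped.
my_cleaner = nh3.Cleaner(
    tags={"a", "b", "i", "img", "p"},
    attributes={
        "a": {"href"},
        "img": {"src"},
    },
    url_schemes={"http", "https"},
)
# javascript: is not an allowed scheme, so the href is removed;
# the <script> element is dropped together with its content.
reusable_cleaned = my_cleaner.clean("<a href='javascript:alert(\"xss\")'>Click</a><b>Bold</b><i>Italic</i><script>evil</script>")
print(f"Reusable cleaner output: {reusable_cleaned}")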
