HTML Sanitizer

library · 2.6.0 · python

✓ verified May 23, 2026

html-sanitizer is an allowlist-based, deliberately opinionated HTML sanitizer for Python that cleans HTML fragments from both untrusted and trusted sources. It builds on `lxml` to produce valid, safe HTML output. Beyond tag and attribute allowlisting, it applies additional transforms that normalize and simplify the markup, so that content coming out of rich text editors ends up consistent. The project is actively maintained.

indexed Tue Apr 14 · updated Fri May 29

Resources

github: github.com/matthiask/html-sanitizer

package: pypi.org/project/html-sanitizer/

API endpoints

full doc /v1/registry/html-sanitizer

install /v1/registry/html-sanitizer/install

compatibility /v1/registry/html-sanitizer/compatibility