HTML Sanitizer

library 2.6.0 · python
verified May 23, 2026

This is an opinionated, allowlist-based HTML sanitizer for Python, designed to clean HTML fragments from both untrusted and trusted sources. It is built on `lxml` to produce valid, safe HTML output. Beyond tag and attribute allowlisting, it applies additional transforms that normalize and simplify the markup, so that output is consistent, especially when the input comes from rich text editors. It is actively maintained.
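To make the allowlisting idea concrete, here is a minimal sketch using only the Python standard library. It is not this library's implementation (which is built on `lxml`) and it is not suitable for production use; the names `ALLOWED`, `DROP_CONTENT`, `AllowlistFilter`, and `sanitize_fragment` are hypothetical and exist only for illustration. It shows the core rule: tags and attributes that are not explicitly allowed are dropped, while the text inside them is kept.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for illustration: tag name -> permitted attributes.
ALLOWED = {
    "p": set(),
    "strong": set(),
    "em": set(),
    "a": {"href"},
}
# Tags whose text content is discarded together with the tag itself.
DROP_CONTENT = {"script", "style"}
# URL schemes accepted in href values (an assumption of this sketch).
SAFE_SCHEMES = ("http://", "https://", "mailto:")


class AllowlistFilter(HTMLParser):
    """Re-emits only allowlisted tags and attributes; escapes all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a DROP_CONTENT element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED:
            return  # drop the tag; any text inside it is still emitted
        kept = [(k, v) for k, v in attrs if k in ALLOWED[tag] and v is not None]
        if tag == "a":
            # Reject links whose scheme is not explicitly allowed.
            kept = [(k, v) for k, v in kept if v.lower().startswith(SAFE_SCHEMES)]
        rendered = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in kept)
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize_fragment(html):
    """Return the fragment with non-allowlisted markup removed."""
    parser = AllowlistFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitize_fragment('<p onclick="x()">Hi <script>alert(1)</script><b>there</b></p>'))
```

Unlike a tree-based sanitizer, this sketch does not repair unbalanced markup or apply any normalizing transforms; it only illustrates the allowlist decision for each tag and attribute.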
