HTML Parser with Fault Tolerance and Sanitization

library · 0.11.0 · javascript · abandoned
verified May 27, 2026

The `html-parser` library is a fault-tolerant parser for HTML and XML, designed to process malformed input without crashing. Its primary feature is sanitization: developers can strip unwanted elements, attributes, and comments from untrusted HTML content. The library exposes a callback-based API that gives granular control over how each kind of token (elements, attributes, text, comments, CDATA, doctype) is handled during parsing. The current version, 0.11.0, was published more than nine years ago, and the package is no longer actively maintained. Its historical differentiators were resilience to invalid markup and built-in, configurable sanitization, which made it suitable for preparing user-generated HTML for safe display. Given its age, however, it may not account for more recent security vulnerabilities.
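To make the callback-driven, fault-tolerant approach concrete, here is a minimal self-contained sketch of the general technique. It does not use or reproduce the `html-parser` package's API: the function names `parseTokens` and `sanitize`, the callback names, and the option names (`elements`, `attributes`, `comments`) are all hypothetical, and the regex-based tokenizer is a simplification that ignores CDATA, doctypes, and many HTML edge cases.

```javascript
// Hypothetical sketch: parseTokens and sanitize are illustrative names,
// not functions exported by the html-parser package.

// Tokenize `html` and report each token through optional callbacks.
// Every character matches some alternative below, so malformed input
// degrades to text tokens instead of raising an error.
function parseTokens(html, callbacks) {
  const tokenRe = /<!--([\s\S]*?)-->|<\/([a-zA-Z][\w:-]*)\s*>|<([a-zA-Z][\w:-]*)((?:"[^"]*"|'[^']*'|[^'">])*)>|([^<]+|<)/g;
  const attrRe = /([^\s=\/"'<>]+)(?:\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+)))?/g;
  const emit = (name, ...args) => {
    if (typeof callbacks[name] === "function") callbacks[name](...args);
  };
  let m;
  while ((m = tokenRe.exec(html)) !== null) {
    if (m[1] !== undefined) {
      emit("comment", m[1]);
    } else if (m[2] !== undefined) {
      emit("closeElement", m[2].toLowerCase());
    } else if (m[3] !== undefined) {
      const attributes = [];
      for (const a of m[4].matchAll(attrRe)) {
        const value = a[2] ?? a[3] ?? a[4]; // undefined for bare attributes
        attributes.push({ name: a[1].toLowerCase(), value });
      }
      emit("openElement", m[3].toLowerCase(), attributes, /\/\s*$/.test(m[4]));
    } else {
      emit("text", m[5]); // plain text, or a stray "<" that opens no tag
    }
  }
}

// Remove the listed elements (with their contents), the listed attributes,
// and optionally comments, re-serializing everything else.
function sanitize(html, options = {}) {
  const dropElements = new Set(options.elements || []);
  const dropAttributes = new Set(options.attributes || []);
  const escapeAttr = (v) => v.replace(/"/g, "&quot;").replace(/</g, "&lt;");
  let out = "";
  let skipDepth = 0; // greater than 0 while inside a removed element
  parseTokens(html, {
    openElement(name, attributes, selfClosing) {
      if (dropElements.has(name)) {
        if (!selfClosing) skipDepth++;
        return;
      }
      if (skipDepth > 0) return;
      const kept = attributes
        .filter((a) => !dropAttributes.has(a.name))
        .map((a) => (a.value === undefined ? a.name : `${a.name}="${escapeAttr(a.value)}"`));
      out += "<" + [name, ...kept].join(" ") + (selfClosing ? " />" : ">");
    },
    closeElement(name) {
      if (dropElements.has(name)) {
        if (skipDepth > 0) skipDepth--;
        return;
      }
      if (skipDepth === 0) out += `</${name}>`;
    },
    text(value) {
      if (skipDepth === 0) out += value.replace(/</g, "&lt;");
    },
    comment(value) {
      if (skipDepth === 0 && !options.comments) out += `<!--${value}-->`;
    },
  });
  return out;
}

const dirty = '<p onclick="evil()" class="a">Hi <b>there<script>alert(1)</script><!-- note --> 1 < 2</p>';
const clean = sanitize(dirty, { elements: ["script"], attributes: ["onclick"], comments: true });
console.log(clean); // <p class="a">Hi <b>there 1 &lt; 2</p>
```

Note that the unclosed `<b>` and the stray `<` pass through without an error, which is the kind of tolerance described above. A blocklist like this is only as safe as the list is complete, so an allowlist of permitted elements and attributes is generally the more conservative design for untrusted input.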
