{"id":16054,"library":"html-parser","title":"HTML Parser with Fault Tolerance and Sanitization","description":"The `html-parser` library provides a fault-tolerant parser for HTML and XML, designed to process even malformed input without 'explosions'. Its primary feature is robust sanitization capabilities, allowing developers to strip unwanted elements, attributes, and comments from untrusted HTML content. The library operates using a callback-based API, offering granular control over how various HTML tokens (elements, attributes, text, comments, CDATA, doctype) are handled during parsing. Currently at version 0.11.0 and last published over nine years ago, this package is no longer actively maintained. Its key differentiators historically were its resilience to invalid markup and its built-in, configurable sanitization features, making it suitable for preparing user-generated HTML for safe display, though its age raises concerns about modern security vulnerabilities.","status":"abandoned","version":"0.11.0","language":"javascript","source_language":"en","source_url":"git://github.com/tmont/html-parser","tags":["javascript","html","xml","parser","explosion"],"install":[{"cmd":"npm install html-parser","lang":"bash","label":"npm"},{"cmd":"yarn add html-parser","lang":"bash","label":"yarn"},{"cmd":"pnpm add html-parser","lang":"bash","label":"pnpm"}],"dependencies":[],"imports":[{"note":"This package is CommonJS-only and does not support ES module imports. Direct named imports are also not supported.","wrong":"import htmlParser from 'html-parser';","symbol":"htmlParser","correct":"const htmlParser = require('html-parser');"},{"note":"The `parse` method is a property of the main `htmlParser` object, not a direct named export. 
CommonJS `require` must be used.","wrong":"import { parse } from 'html-parser';","symbol":"parse","correct":"const htmlParser = require('html-parser');\nhtmlParser.parse(htmlString, callbacks);"},{"note":"The `sanitize` method is a property of the main `htmlParser` object, not a direct named export. CommonJS `require` must be used.","wrong":"import { sanitize } from 'html-parser';","symbol":"sanitize","correct":"const htmlParser = require('html-parser');\nconst sanitizedHtml = htmlParser.sanitize(htmlString, options);"}],"quickstart":{"code":"const htmlParser = require('html-parser');\n\nconst html = '<!doctype html><html><body onload=\"alert(\\'hello\\');\">Hello<br />world</body></html>';\n\nconsole.log('--- Parsing Example ---');\nhtmlParser.parse(html, {\n\topenElement: function(name) { console.log('open: %s', name); },\n\tcloseOpenedElement: function(name, token, unary) { console.log('token: %s, unary: %s', token, unary); },\n\tcloseElement: function(name) { console.log('close: %s', name); },\n\tcomment: function(value) { console.log('comment: %s', value); },\n\tcdata: function(value) { console.log('cdata: %s', value); },\n\tattribute: function(name, value) { console.log('attribute: %s=%s', name, value); },\n\tdocType: function(value) { console.log('doctype: %s', value); },\n\ttext: function(value) { console.log('text: %s', value); }\n});\n\nconst maliciousHtml = '<script>alert(\\'danger!\\')</script><p onclick=\"alert(\\'danger!\\')\">blah blah<!-- useless comment --></p>';\nconsole.log('\\n--- Sanitization Example ---');\nconst sanitized = htmlParser.sanitize(maliciousHtml, {\n\telements: [ 'script' ], // Elements to remove\n\tattributes: [ 'onclick' ], // Attributes to remove\n\tcomments: true // Remove comments\n});\nconsole.log('Original: %s', maliciousHtml);\nconsole.log('Sanitized: %s', sanitized);","lang":"javascript","description":"This quickstart demonstrates both the callback-based HTML parsing and the sanitization features of the library. 
It shows how to process an HTML string, logging events for various tokens, and how to remove malicious script tags, event attributes, and comments."},"warnings":[{"fix":"Migrate to a maintained HTML parsing and sanitization library like `htmlparser2` or `parse5` for parsing, and a dedicated sanitization library like `dompurify` for security.","message":"This package is pre-1.0 (v0.11.0) and abandoned, meaning its API is not stable and may have contained breaking changes between minor versions. There is no guarantee of backward compatibility.","severity":"breaking","affected_versions":"<=0.11.0"},{"fix":"Always use `const htmlParser = require('html-parser');` to import the library in Node.js environments.","message":"The package is CommonJS-only and does not provide ES module exports. Attempting to use `import` statements will result in errors.","severity":"gotcha","affected_versions":">=0.1.0"},{"fix":"For secure HTML sanitization, use actively maintained and peer-reviewed libraries such as `dompurify`. Consider server-side sanitization as a primary defense.","message":"The `sanitize` function, while provided, relies on simple element/attribute blacklists or callback logic. Given the library's abandonment, it is highly unlikely to be robust against modern XSS vectors and other security vulnerabilities. It should not be solely relied upon for security-critical sanitization without thorough, independent auditing.","severity":"gotcha","affected_versions":">=0.1.0"},{"fix":"It is strongly recommended to migrate to a modern, actively maintained HTML parsing and sanitization library.","message":"This package is over nine years old and has not been updated. 
It may contain unpatched security vulnerabilities, performance issues, or incompatibilities with newer Node.js versions or browser environments.","severity":"gotcha","affected_versions":">=0.1.0"}],"env_vars":null,"last_verified":"2026-04-21T00:00:00.000Z","next_check":"2026-07-20T00:00:00.000Z","problems":[{"fix":"Ensure you are using CommonJS `require` and accessing `parse` as a method of the object it returns: `const htmlParser = require('html-parser'); htmlParser.parse(...)`","cause":"Attempting to use `import { parse } from 'html-parser';` or not correctly requiring the module.","error":"TypeError: htmlParser.parse is not a function"},{"fix":"Add `const htmlParser = require('html-parser');` at the top of your file to ensure the module is loaded and accessible.","cause":"The module was not correctly `require`d or is out of scope.","error":"ReferenceError: htmlParser is not defined"},{"fix":"Review the `sanitize` options carefully. The `elements` and `attributes` options each specify *what to remove*, given either as an array of names or as a callback function that returns `true` for items to be removed. Ensure `comments: true` is set if comments should be stripped.","cause":"Incorrect configuration of the `elements` or `attributes` options in the `sanitize` method, or leaving `comments` set to `false` when comments should be stripped.","error":"Sanitized HTML still contains unwanted elements, attributes, or comments."}],"ecosystem":"npm"}