{"id":13300,"library":"html-dom-parser","title":"HTML DOM Parser","description":"html-dom-parser converts an HTML string into a JavaScript object representation of the Document Object Model (DOM) tree. It runs in Node.js, where it uses `htmlparser2` and `domhandler` internally for performance, and in the browser, where it uses the native DOM API to mimic the server parser's output. The current version is 7.0.1; recent releases have consisted mainly of bug fixes and dependency bumps. Its key differentiator is a consistent, serializable, DOM-like output (plain old JavaScript objects) across JavaScript environments, which makes it suitable for server-side HTML manipulation, client-side virtual DOM implementations, and data extraction where direct DOM access is unavailable or inefficient.","status":"active","version":"7.0.1","language":"javascript","source_language":"en","source_url":"https://github.com/remarkablemark/html-dom-parser","tags":["javascript","html-dom-parser","html","dom","parser","htmlparser2","pojo","typescript"],"install":[{"cmd":"npm install html-dom-parser","lang":"bash","label":"npm"},{"cmd":"yarn add html-dom-parser","lang":"bash","label":"yarn"},{"cmd":"pnpm add html-dom-parser","lang":"bash","label":"pnpm"}],"dependencies":[{"reason":"Core dependency for server-side HTML parsing; converts the HTML string into parse events for the handler. Its major version was bumped in html-dom-parser v6.0.0 and v7.0.0.","package":"htmlparser2","optional":false},{"reason":"Used by htmlparser2 to construct the JavaScript object DOM tree from the parsed HTML events. 
Its major version was bumped in html-dom-parser v6.0.0.","package":"domhandler","optional":false}],"imports":[{"note":"This is the primary default export for ESM environments.","symbol":"parse","correct":"import parse from 'html-dom-parser';"},{"note":"For CommonJS, the default export is exposed via the `.default` property. Directly requiring the package will return the module object, not the parser function itself.","wrong":"const parse = require('html-dom-parser');","symbol":"parse","correct":"const parse = require('html-dom-parser').default;"},{"note":"Used for type-checking and defining options when parsing HTML in server-side (Node.js) environments. This interface is exported directly by the package.","symbol":"ParserOptions","correct":"import type { ParserOptions } from 'html-dom-parser';"}],"quickstart":{"code":"import parse, { ParserOptions } from 'html-dom-parser';\n\n// Assuming domhandler-style nodes: each node links to its parent and siblings\n// through parent/prev/next, so a plain JSON.stringify would fail on the\n// circular references. This replacer omits those links.\nconst omitLinks = (key: string, value: unknown) =>\n  key === 'parent' || key === 'prev' || key === 'next' ? undefined : value;\n\n// Basic usage in an ESM environment (Node.js or browser)\nconst simpleHtml = '<p>Hello, <strong>World!</strong></p>';\nconst domNodes = parse(simpleHtml);\nconsole.log('Parsed simple HTML:', JSON.stringify(domNodes, omitLinks, 2));\n\n// Example with server-side options (Node.js only)\n// Options are passed through to htmlparser2 and domhandler.\nconst complexHtml = '<!DOCTYPE html><html><head><title>Test</title></head><body><script>alert(\"XSS\");</script><div>Text &amp; Entities</div></body></html>';\nconst parserOptions: ParserOptions = {\n  decodeEntities: true, // Decode HTML entities like &amp;\n  xmlMode: false,      // Treat input as HTML, not XML\n  lowerCaseTags: true  // Ensure all tag names are lowercase\n};\n\nconst serverDomNodes = parse(complexHtml, parserOptions);\nconsole.log('\\nParsed complex HTML with options:', JSON.stringify(serverDomNodes, omitLinks, 2));\n\n// Accessing properties of the parsed DOM\nconst firstNode = domNodes[0];\nif (firstNode && 'name' in firstNode) {\n  console.log('\\nFirst node tag name:', firstNode.name);\n  if (firstNode.children && firstNode.children.length > 0) {\n    const 
textNode = firstNode.children[0];\n    if ('data' in textNode) {\n      console.log('First child text data:', textNode.data);\n    }\n  }\n}","lang":"typescript","description":"This quickstart demonstrates how to parse a basic HTML string, including how to apply server-side parsing options for more complex scenarios, and access properties of the resulting DOM nodes."},"warnings":[{"fix":"Review your parsing logic and expected DOM structures, particularly for complex or malformed HTML inputs. Consult the `htmlparser2` v12 changelog for specific behavioral changes.","message":"Version 7.0.0 introduced a breaking change by bumping the `htmlparser2` dependency from 11.0.0 to 12.0.0. This update in `htmlparser2` aligns HTML parsing more closely with the WHATWG specification, which may lead to subtle changes in the parsed DOM structure, entity decoding, and handling of certain tag types, especially for raw-text and RCDATA elements (like `<iframe>`, `<textarea>`).","severity":"breaking","affected_versions":">=7.0.0"},{"fix":"If you relied on `formatAttributes` or `CARRIAGE_RETURN`, you will need to find alternative implementations or manually handle these aspects. Review the changelogs for `htmlparser2` and `domhandler` for changes in DOM node structure or parsing behavior that might affect your application.","message":"Version 6.0.0 brought several breaking changes, including the removal of `formatAttributes` and `CARRIAGE_RETURN` constants from the client-side exports. Additionally, `htmlparser2` was bumped from 10.1.0 to 11.0.0 and `domhandler` from 5.0.3 to 6.0.1.","severity":"breaking","affected_versions":">=6.0.0 <7.0.0"},{"fix":"Always use `import parse from 'html-dom-parser';` for ESM and `const parse = require('html-dom-parser').default;` for CommonJS to ensure correct module resolution. 
Upgrade to the latest stable version (7.x.x) to benefit from build and type fixes.","message":"Some 5.x releases before 5.1.7 (e.g., v5.1.5, v5.1.6) had problems with the ESM build and type resolution, which led to `ModuleNotFoundError` in some environments. These problems have been fixed, but you still need to use the correct import syntax for your module system.","severity":"gotcha","affected_versions":"<5.1.7"},{"fix":"Upgrade to version 5.1.8 or later, preferably the latest 7.x.x release, to protect against this and other potential security vulnerabilities.","message":"A security fix in v5.1.8 addressed a polynomial regular expression used on uncontrolled data, which could lead to a Regular Expression Denial of Service (ReDoS) vulnerability. The issue is fixed in later releases, so keep the package updated.","severity":"gotcha","affected_versions":"<5.1.8"}],"env_vars":null,"last_verified":"2026-04-19T00:00:00.000Z","next_check":"2026-07-18T00:00:00.000Z","problems":[{"fix":"For CommonJS, use `const parse = require('html-dom-parser').default;` to correctly access the default exported parser function.","cause":"Calling the result of CommonJS `require('html-dom-parser')` directly as a function. `require` returns the module object; the parser function is its default export.","error":"TypeError: html_dom_parser_1.default is not a function"},{"fix":"Ensure your project is configured for ESM (`\"type\": \"module\"` in `package.json` or `.mjs` extension). 
If issues persist, verify that your build tools (e.g., Rollup, Webpack) correctly bundle ESM-only dependencies for CJS output if required, and upgrade `html-dom-parser` to version 5.1.7 or later, which includes fixes for ESM bundling.","cause":"Incorrect module resolution or package configuration when using `html-dom-parser` in an ESM context, particularly in older versions or specific build setups.","error":"ERR_MODULE_NOT_FOUND: Cannot find package 'html-dom-parser' imported from ..."},{"fix":"Narrow the node's type with a type guard before accessing element-specific properties, e.g., `if ('name' in node && node.type === 'tag') { console.log(node.name); }` or `if (node instanceof Element) { console.log(node.name); }`. Note that the `Element` class might not be exported directly by this package or be identical to `domhandler`'s `Element` type.","cause":"TypeScript error when accessing properties like `name` or `attribs` on a generic `ChildNode` without narrowing its type. The output DOM nodes are typically `Element` or `Text` instances.","error":"Property 'name' does not exist on type 'ChildNode' | 'Element'."}],"ecosystem":"npm","meta_description":null,"install_score":null,"install_tag":null,"quickstart_score":null,"quickstart_tag":null,"pypi_latest":null,"cli_name":"","cli_version":null}