{"id":15590,"library":"dom-parser","title":"Fast Regex-Based DOM Parser","description":"dom-parser is a lightweight, zero-dependency library for parsing HTML and XML documents into a DOM-like structure using regular expressions. It provides a subset of standard DOM API methods like `getElementById`, `getElementsByClassName`, and `getElementsByTagName`, along with common node properties such as `innerHTML` and `textContent`. The current stable version is 1.1.5. Due to its regexp-based parsing, it is notably fast and compact, making it suitable for environments where full-fledged, specification-compliant DOM parsing (like `jsdom` or browser DOMParser) is overkill or too resource-intensive. Its main differentiator is performance and minimal footprint by leveraging regexps, though this approach might have limitations with highly malformed or complex HTML structures compared to state-machine parsers.","status":"active","version":"1.1.5","language":"javascript","source_language":"en","source_url":"https://github.com/ershov-konst/dom-parser","tags":["javascript","domparser","dom","parser","xml","html","xmlparser","htmlparser","scraping","typescript"],"install":[{"cmd":"npm install dom-parser","lang":"bash","label":"npm"},{"cmd":"yarn add dom-parser","lang":"bash","label":"yarn"},{"cmd":"pnpm add dom-parser","lang":"bash","label":"pnpm"}],"dependencies":[],"imports":[{"note":"The library primarily uses named exports. 
While CommonJS `require` might work in some transpiled environments, native ESM import is the intended and officially supported way for modern JavaScript and TypeScript projects.","wrong":"const { parseFromString } = require('dom-parser');","symbol":"parseFromString","correct":"import { parseFromString } from 'dom-parser';"},{"note":"When importing the type for the parsed DOM object (returned by `parseFromString`), use `import type` for clarity and to ensure it's stripped from the JavaScript output.","symbol":"Dom","correct":"import type { Dom } from 'dom-parser';"},{"note":"To type individual DOM nodes, use `import type { Node } from 'dom-parser';`. `Node` is the generic node interface, with properties such as `nodeName` and `attributes` and methods such as `getAttribute`.","symbol":"Node","correct":"import type { Node } from 'dom-parser';"}],"quickstart":{"code":"import { parseFromString } from 'dom-parser';\n\n// Simulate reading an HTML file asynchronously\nasync function simulateReadFile(filePath: string): Promise<string> {\n  if (filePath === 'htmlToParse.html') {\n    return `\n      <div id=\"rootNode\">\n        <p class=\"childNodeClass\">Hello from child 1</p>\n        <span class=\"childNodeClass\">Hello from child 2</span>\n        <a href=\"#\" name=\"mylink\">Link</a>\n      </div>\n      <div class=\"childNodeClass\">Another root child</div>\n    `;\n  }\n  return '';\n}\n\nasync function main() {\n  const html = await simulateReadFile('htmlToParse.html');\n\n  // Parse the HTML string into a DOM model\n  const dom = parseFromString(html);\n\n  // Look up nodes by id, then by class name and name attribute within that node\n  const rootNode = dom.getElementById('rootNode');\n  if (rootNode) {\n    console.log('Found rootNode, nodeName:', rootNode.nodeName);\n    const childNodes = rootNode.getElementsByClassName('childNodeClass');\n    console.log('Children with class \"childNodeClass\":', childNodes.length);\n    childNodes.forEach(node => console.log(' - Child text:', node.textContent));\n\n    const myLink = rootNode.getElementsByName('mylink')[0];\n    if (myLink) {\n      console.log('Found link href:', myLink.getAttribute('href'));\n    }\n  }\n}\n\nmain();","lang":"typescript","description":"This example parses an HTML string, finds an element by ID, searches within that element by class name and by `name` attribute, and reads an attribute value."},"warnings":[{"fix":"For mission-critical applications or when dealing with highly varied and potentially non-standard HTML, consider using a fully HTML5-compliant parser (e.g., `jsdom` in Node.js or `DOMParser` in browsers) if strict parsing rules are required. Test thoroughly with your specific HTML inputs.","message":"Due to its RegExp-based parsing approach, `dom-parser` might not always produce a DOM structure identical to what a browser's native DOMParser or a compliant library like `jsdom` would for highly malformed or edge-case HTML. It prioritizes speed and simplicity over full HTML5 specification compliance.","severity":"gotcha","affected_versions":">=1.0.0"},{"fix":"Review the API documentation carefully for available methods and properties. If you require functionality not present in `dom-parser`, you may need to implement custom logic or opt for a more comprehensive DOM library.","message":"The `Node` API provided by `dom-parser` is a subset of the standard browser DOM API. While common methods like `getElementById`, `getElementsByClassName`, and properties like `innerHTML` are present, more advanced features or less common properties/methods of the native DOM (e.g., `querySelector`, event handling, style manipulation) are not implemented.","severity":"gotcha","affected_versions":">=1.0.0"}],"env_vars":null,"last_verified":"2026-04-21T00:00:00.000Z","next_check":"2026-07-20T00:00:00.000Z","problems":[{"fix":"Ensure you have correctly called `parseFromString(htmlContent)` and are calling methods on the returned `Dom` object. 
Example: `const dom = parseFromString(html); const root = dom.getElementById('myId');`","cause":"Attempting to call a DOM method on `dom` directly before parsing, or on an incorrect object.","error":"TypeError: dom.getElementById is not a function"},{"fix":"Verify that `dom-parser`'s types are correctly installed and configured. Ensure you are importing `Node` from `dom-parser` if you are explicitly typing your variables. If the issue persists, consider type assertion: `(myNode as any).textContent` or more specifically `(myNode as HtmlNode).textContent` if `HtmlNode` type is exported and applicable.","cause":"TypeScript error indicating that the `Node` type might not explicitly declare `textContent` (though the library's `Node` interface *does* have it, this can happen if types are misaligned or if a different `Node` type is implicitly used).","error":"Property 'textContent' does not exist on type 'Node'"}],"ecosystem":"npm"}