HAST Utility for Parsing HTML

2.0.3 · active · verified Sun Apr 19

hast-util-from-html is a utility in the unified ecosystem that parses a serialized HTML string into a HAST (Hypertext Abstract Syntax Tree) tree. The current stable version is 2.0.3. Recent patch releases addressed type issues, and the 2.0.0 major release brought breaking changes, including a Node.js 16+ requirement; the package is ESM-only. It suits cases where you want to work with the syntax tree directly and control how parsing is done. It differs from `parse5`, a low-level HTML parser, in that it produces HAST nodes directly, and from `rehype-parse`, a higher-level abstraction that wraps parsing as a unified plugin. For browser environments, `hast-util-from-html-isomorphic` offers a lighter, albeit less feature-rich, alternative.
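To make "HAST nodes" concrete, here is a minimal sketch of the tree shape for a small piece of markup, written by hand rather than produced by the package. The `Text`, `Element`, and `Root` types are simplified stand-ins for the ones published in `hast`, and `sampleTree` and `toText` are hypothetical names used only for illustration; positional info that a real parse may attach is omitted.

```typescript
// Simplified stand-ins for the node types published in the `hast` package.
type Text = { type: 'text'; value: string };
type Element = {
  type: 'element';
  tagName: string;
  properties: Record<string, unknown>;
  children: Array<Element | Text>;
};
type Root = { type: 'root'; children: Array<Element | Text> };

// Hand-written tree for `<h1>Hello, <em>world</em>!</h1>` (positions omitted).
const sampleTree: Root = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'h1',
      properties: {},
      children: [
        { type: 'text', value: 'Hello, ' },
        {
          type: 'element',
          tagName: 'em',
          properties: {},
          children: [{ type: 'text', value: 'world' }],
        },
        { type: 'text', value: '!' },
      ],
    },
  ],
};

// Concatenate the text of a node and all of its descendants.
function toText(node: Root | Element | Text): string {
  if (node.type === 'text') return node.value;
  return node.children.map((child) => toText(child)).join('');
}

console.log(toText(sampleTree)); // prints "Hello, world!"
```

Because every node is a plain object with a `type` field, this kind of manual inspection and rewriting needs no parser-specific API once the tree exists.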

Quickstart

This example parses an HTML string into a HAST tree with the `fromHtml` function, first as a fragment and then as a full document.

import { fromHtml } from 'hast-util-from-html';
import type { Root } from 'hast';

const htmlInput = '<h1>Hello, <em>world</em>!</h1><p>This is a paragraph.</p>';

// Parse as a document fragment to avoid automatic <html>, <head>, <body> insertion.
const tree: Root = fromHtml(htmlInput, { fragment: true });

console.log(JSON.stringify(tree, null, 2));

// Example of parsing as a full document
const documentHtml = '<!DOCTYPE html><html><head><title>Test</title></head><body><h1>Doc</h1></body></html>';
const docTree: Root = fromHtml(documentHtml);
console.log('\n--- Full Document Parse ---\n');
console.log(JSON.stringify(docTree, null, 2));
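One quick way to compare the two parse modes is to list the element names a tree contains. The sketch below is an assumption-laden illustration: `HastNode`, `collectTagNames`, and `fragmentLike` are hypothetical names, and `fragmentLike` is a hand-written stand-in for a fragment tree, not actual output of `fromHtml`. A tree from a full-document parse would be expected to list `html`, `head`, and `body` as well.

```typescript
// Loose structural type covering only the fields this walker reads.
type HastNode = {
  type: string;
  tagName?: string;
  children?: HastNode[];
};

// Return element tag names in depth-first document order.
function collectTagNames(node: HastNode): string[] {
  const names: string[] = [];
  if (node.type === 'element' && node.tagName) {
    names.push(node.tagName);
  }
  for (const child of node.children ?? []) {
    names.push(...collectTagNames(child));
  }
  return names;
}

// Hand-written stand-in for a fragment tree with an <h1>, a nested <em>, and a <p>.
const fragmentLike: HastNode = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'h1',
      children: [
        { type: 'text' },
        { type: 'element', tagName: 'em', children: [{ type: 'text' }] },
      ],
    },
    { type: 'element', tagName: 'p', children: [{ type: 'text' }] },
  ],
};

console.log(collectTagNames(fragmentLike)); // prints [ 'h1', 'em', 'p' ]
```

Since the walker only reads `type`, `tagName`, and `children`, a tree returned by `fromHtml` should be usable in place of `fragmentLike`, though that depends on the types lining up in your project.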
