HAST Plain-Text Extraction
hast-util-to-text is a utility for the unified ecosystem that extracts the plain-text value of a hast (HTML abstract syntax tree) node. It approximates the DOM's `Node#innerText` algorithm, which gives friendlier output than `Node#textContent` (the behavior `hast-util-to-string` follows): `<br>` elements become line breaks and table cells are separated by tabs (`\t`). The package is currently at version 4.0.2 and actively maintained; patch releases carry fixes, minor releases add features, and major releases are reserved for breaking changes. Its key differentiator is this `innerText`-like behavior, which yields text close to how the content would be rendered. Because it works on the tree alone, it cannot account for CSS properties such as `display: none` or `text-transform`.
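To make the difference between the two behaviors concrete, here is a toy sketch on a hand-built tree. It is not the library's implementation: `textContentLike`, `innerTextLike`, `el`, and `text` are hypothetical helpers, and the sketch ignores whitespace collapsing, hidden elements, and the line breaks that block elements would add.

```javascript
// Simplified illustration only; NOT how hast-util-to-text is implemented.
// Tiny helpers (hypothetical) that build hast-like plain objects.
const text = (value) => ({type: 'text', value})
const el = (tagName, children = []) => ({type: 'element', tagName, properties: {}, children})

const tree = el('div', [
  el('p', [text('Bravo'), el('br'), text('charlie.')]),
  el('table', [el('tr', [el('td', [text('Cell 1')]), el('td', [text('Cell 2')])])])
])

// textContent-like: concatenate every text node and ignore element semantics.
function textContentLike(node) {
  if (node.type === 'text') return node.value
  return (node.children || []).map(textContentLike).join('')
}

const isCell = (n) => n.type === 'element' && (n.tagName === 'td' || n.tagName === 'th')

// innerText-like (toy): <br> becomes "\n", and a tab follows every table cell
// that has a later sibling cell.
function innerTextLike(node, parent) {
  if (node.type === 'text') return node.value
  if (node.tagName === 'br') return '\n'
  let result = (node.children || []).map((child) => innerTextLike(child, node)).join('')
  if (parent && isCell(node)) {
    const later = parent.children.slice(parent.children.indexOf(node) + 1)
    if (later.some(isCell)) result += '\t'
  }
  return result
}

console.log(JSON.stringify(textContentLike(tree))) // "Bravocharlie.Cell 1Cell 2"
console.log(JSON.stringify(innerTextLike(tree))) // "Bravo\ncharlie.Cell 1\tCell 2"
```

The real utility goes further than this sketch, for example by separating paragraphs with blank lines and collapsing whitespace.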
Common errors
- ERR_REQUIRE_ESM
  cause: Attempting to `require()` an ESM-only package.
  fix: Change `const { toText } = require('hast-util-to-text')` to `import { toText } from 'hast-util-to-text'`.
- TypeError: toText is not a function
  cause: Incorrect import syntax (e.g., a default import when only named exports exist, or destructuring from an incorrectly transpiled CJS module).
  fix: Ensure you are using named imports: `import { toText } from 'hast-util-to-text'`.
- Error [ERR_PACKAGE_PATH_NOT_EXPORTED]: Package 'hast-util-to-text' was not found at package.json#exports.
  cause: Using an older Node.js version (<16) that doesn't fully support the `exports` field in `package.json`, or outdated bundler/tooling.
  fix: Update Node.js to version 16 or newer. Ensure your build tooling (e.g., webpack, Rollup, Parcel) is up to date and configured to handle ESM and `exports` maps correctly.
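If a codebase has to stay on CommonJS, Node.js's dynamic `import()` is one way to load an ESM-only package without `require()`. The sketch below is an assumption-laden illustration: `loadNamedExport` is a hypothetical helper, not part of hast-util-to-text, and it also guards against the "is not a function" mistake by checking that the named export exists.

```javascript
// Hypothetical helper (not part of hast-util-to-text): load one named export
// from an ESM-only package via dynamic import(), which also works in CommonJS.
async function loadNamedExport(specifier, name) {
  const mod = await import(specifier)
  if (typeof mod[name] !== 'function') {
    // Fail early with a clearer message than "x is not a function".
    throw new TypeError(`"${specifier}" has no function export named "${name}"`)
  }
  return mod[name]
}

// Usage from a CommonJS file, assuming hast-util-to-text is installed and
// `tree` is a hast node:
// loadNamedExport('hast-util-to-text', 'toText').then((toText) => {
//   console.log(toText(tree))
// })
```

Switching the project to ESM and using a plain `import` statement remains the simpler fix where that is possible.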
Warnings
- breaking Version 4.0.0 requires Node.js 16 or higher; older Node.js versions are no longer supported.
- breaking Version 4.0.0 changed the package to use the `exports` field in `package.json`, which affects how it can be imported, particularly in CommonJS environments or older bundlers. It's designed for modern ESM import patterns.
- breaking Version 3.0.0 converted the package to be ESM-only. CommonJS `require()` statements will no longer work.
- breaking Version 2.0.0 updated `unist-util-find-after`, which could be a breaking change, particularly for TypeScript users or dependents relying on specific type definitions.
- gotcha This utility's `innerText` algorithm is an approximation and deviates from the DOM specification in some cases. It cannot account for CSS properties like `display: none` or `text-transform` that dynamically alter text visibility or appearance, nor does it process replaced elements (e.g., `<audio>`) as the DOM would.
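Since CSS is not taken into account, one possible workaround is to remove elements you know are visually hidden before extracting text. The sketch below assumes a project convention where hidden elements carry an `is-hidden` class; `prune` and `hasHiddenClass` are hypothetical helpers, and the class name is an assumption, not something the library defines.

```javascript
// Hypothetical pre-processing step: return a copy of the tree without the
// elements for which `isHidden` returns true. The input tree is not mutated.
function prune(node, isHidden) {
  if (!node.children) return node
  const children = node.children
    .filter((child) => !(child.type === 'element' && isHidden(child)))
    .map((child) => prune(child, isHidden))
  return {...node, children}
}

// Assumption: hidden elements are marked with an `is-hidden` class. In hast,
// `class` is stored as an array of strings in `properties.className`.
const hasHiddenClass = (element) => {
  const className = (element.properties || {}).className
  return Array.isArray(className) && className.includes('is-hidden')
}

const tree = {
  type: 'element',
  tagName: 'div',
  properties: {},
  children: [
    {type: 'element', tagName: 'p', properties: {className: ['is-hidden']}, children: [{type: 'text', value: 'Secret'}]},
    {type: 'element', tagName: 'p', properties: {}, children: [{type: 'text', value: 'Visible'}]}
  ]
}

const visible = prune(tree, hasHiddenClass)
// `visible` could then be passed to toText in place of `tree`.
console.log(visible.children.length) // 1
```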
Install
- npm install hast-util-to-text
- yarn add hast-util-to-text
- pnpm add hast-util-to-text
Imports
- toText
  wrong: const toText = require('hast-util-to-text')
  right: import { toText } from 'hast-util-to-text'
- Options
  import type { Options } from 'hast-util-to-text'
- Whitespace
  import type { Whitespace } from 'hast-util-to-text'
Quickstart
import {h} from 'hastscript'
import {toText} from 'hast-util-to-text'

const tree = h('div', [
  h('h1', {hidden: true}, 'Alpha.'), // hidden, so omitted from the output
  h('article', [
    h('p', ['Bravo', h('br'), 'charlie.']), // <br> becomes a newline
    h('p', 'Delta echo \t foxtrot.') // whitespace, including the tab, collapses to one space
  ]),
  h('table', [
    h('tr', [
      h('td', 'Cell 1'),
      h('td', 'Cell 2')
    ])
  ])
])

console.log(toText(tree))
// Expected output:
// Bravo
// charlie.
//
// Delta echo foxtrot.
//
// Cell 1\tCell 2   (a real tab character separates the two cells)