HAST Node Plain-Text Converter

3.0.1 · active · verified Sun Apr 19

hast-util-to-string is a utility in the `unified` ecosystem that returns the plain-text value of a `hast` (HTML Abstract Syntax Tree) node. It mimics the DOM's `Node#textContent` getter: it concatenates all text inside the node regardless of styling or layout, and it does not turn elements such as `<br>` into newlines. This distinguishes it from `hast-util-to-text`, which emulates `Node#innerText` and accounts for how the content would be rendered. The current stable version is 3.0.1. The package is maintained as part of the `unified` collective, and its major versions tend to raise the minimum Node.js version (v3 requires Node.js 16+). It is ESM-only, uses the package `exports` field, and ships TypeScript type definitions.

Common errors

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to convert both simple and complex HAST nodes into their plain-text representations using `hast-util-to-string`, highlighting its `Node#textContent`-like behavior.

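// The Element type comes from the hast type definitions (@types/hast), not from hastscript.
import type {Element} from 'hast';
import {h} from 'hastscript';
import {toString} from 'hast-util-to-string';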

// Create a simple HAST paragraph node
const paragraphNode: Element = h('p', 'This is a simple paragraph.');
console.log('Input HAST node (paragraph):', JSON.stringify(paragraphNode, null, 2));
console.log('Plain text output:', toString(paragraphNode));
// Expected output: 'This is a simple paragraph.'

// Create a more complex HAST div node with nested elements and a break tag
const complexNode: Element = h('div', [
  h('b', 'Bold text'),
  ' and ', 
  h('i', 'italic text'),
  h('br'), // A break tag
  ' on the same line according to textContent (no newline).' // textContent doesn't add newline for <br>
]);
console.log('\nInput HAST node (complex div):', JSON.stringify(complexNode, null, 2));
console.log('Plain text output:', toString(complexNode));
// Expected output: 'Bold text and italic text on the same line according to textContent (no newline).'
// This demonstrates that <br> tags are ignored when mimicking Node#textContent behavior.
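
To make the `Node#textContent` versus `Node#innerText` distinction concrete, here is a minimal, dependency-free sketch. It is not the source of `hast-util-to-string` or `hast-util-to-text`: the `Sketch*` types and the functions `textContentSketch` and `innerTextSketch` are hypothetical names defined only for this illustration, and the innerText-style variant models a single layout rule (`<br>` becomes a newline), which is far cruder than what a real implementation would do.

```typescript
// Minimal hast-like node shapes, defined locally for illustration only.
type SketchText = {type: 'text'; value: string};
type SketchComment = {type: 'comment'; value: string};
type SketchElement = {type: 'element'; tagName: string; children: SketchNode[]};
type SketchNode = SketchText | SketchComment | SketchElement;

// textContent-style: concatenate descendant text nodes; the tag name never matters.
// Comments nested inside an element contribute nothing.
function textContentSketch(node: SketchNode): string {
  if (node.type === 'element') {
    return node.children
      .map((child) => (child.type === 'comment' ? '' : textContentSketch(child)))
      .join('');
  }
  // A text or comment node passed in directly yields its own value.
  return node.value;
}

// Crude innerText-style contrast (assumption: only <br> affects layout here).
function innerTextSketch(node: SketchNode): string {
  if (node.type === 'text') return node.value;
  if (node.type === 'comment') return '';
  if (node.tagName === 'br') return '\n';
  return node.children.map((child) => innerTextSketch(child)).join('');
}

const tree: SketchElement = {
  type: 'element',
  tagName: 'div',
  children: [
    {type: 'element', tagName: 'b', children: [{type: 'text', value: 'Bold'}]},
    {type: 'text', value: ' and '},
    {type: 'element', tagName: 'br', children: []},
    {type: 'text', value: 'plain'}
  ]
};

const flat = textContentSketch(tree);
const broken = innerTextSketch(tree);
console.log(JSON.stringify(flat));   // "Bold and plain"
console.log(JSON.stringify(broken)); // "Bold and \nplain"
```

In the textContent-style walk, the `<br>` element has no text descendants and so contributes an empty string, which is why the quickstart output above stays on one line.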
