{"id":11019,"library":"hast-util-to-nlcst","title":"HAST to NLCST Transformer","description":"hast-util-to-nlcst is a utility package within the unified/syntax-tree ecosystem designed to transform a HAST (HTML Abstract Syntax Tree) into an NLCST (Natural Language Concrete Syntax Tree). This transformation extracts the natural language content from an HTML structure, making it suitable for natural language processing tasks such as linting, sentiment analysis, or spell checking with tools like retext. The current stable version is 4.0.0. The package follows semantic versioning; past major releases introduced breaking changes in environment support (ESM-only packaging, minimum Node.js version) or in the parser API. Its role is narrow: it bridges HTML content to natural language processing within the unist family of syntax trees, and it has no mechanism for applying changes from NLCST back to HAST. It is often used in conjunction with parsers like `parse-english` and wrappers like `rehype-retext`.","status":"active","version":"4.0.0","language":"javascript","source_language":"en","source_url":"https://github.com/syntax-tree/hast-util-to-nlcst","tags":["javascript","unist","hast","hast-util","util","utility","rehype","retext","nlcst","typescript"],"install":[{"cmd":"npm install hast-util-to-nlcst","lang":"bash","label":"npm"},{"cmd":"yarn add hast-util-to-nlcst","lang":"bash","label":"yarn"},{"cmd":"pnpm add hast-util-to-nlcst","lang":"bash","label":"pnpm"}],"dependencies":[{"reason":"This utility requires an NLCST parser (e.g., `parse-english`, `parse-latin`, `parse-dutch`) to be provided as a parameter to its `toNlcst` function. While not a direct `peerDependency` in package.json, it's a functional requirement for practical use.","package":"parse-english","optional":true}],"imports":[{"note":"This package has been ESM-only since v2.0.0 and has required Node.js 16+ since v4.0.0. 
CommonJS `require` statements will fail.","wrong":"const toNlcst = require('hast-util-to-nlcst')","symbol":"toNlcst","correct":"import { toNlcst } from 'hast-util-to-nlcst'"},{"note":"Use `import type` for type-only imports so that the import is erased at compile time.","symbol":"ParserConstructor","correct":"import type { ParserConstructor } from 'hast-util-to-nlcst'"},{"note":"Use `import type` for type-only imports.","symbol":"ParserInstance","correct":"import type { ParserInstance } from 'hast-util-to-nlcst'"}],"quickstart":{"code":"import { fromHtml } from 'hast-util-from-html';\nimport { toNlcst } from 'hast-util-to-nlcst';\nimport { ParseEnglish } from 'parse-english';\nimport { readSync } from 'to-vfile';\nimport { inspect } from 'unist-util-inspect';\nimport * as fs from 'fs';\nimport * as path from 'path';\n\n// Create a dummy HTML file for the example\nconst exampleHtmlContent = `\n<article>\n  Implicit.\n  <h1>Explicit: <strong>foo</strong>s-ball</h1>\n  <pre><code class=\"language-foo\">bar()</code></pre>\n</article>\n`;\nconst exampleHtmlPath = path.join(process.cwd(), 'example.html');\nfs.writeFileSync(exampleHtmlPath, exampleHtmlContent);\n\n// Read the file from disk as a VFile\nconst file = readSync(exampleHtmlPath);\n// Parse the file's HTML into a HAST tree with positional info\nconst tree = fromHtml(file);\n\n// Transform HAST to NLCST using ParseEnglish\nconst nlcstTree = toNlcst(tree, file, ParseEnglish);\n\n// Log the inspected NLCST tree\nconsole.log(inspect(nlcstTree));\n\n// Clean up the dummy file\nfs.unlinkSync(exampleHtmlPath);","lang":"typescript","description":"This example writes a small HTML file, reads it as a `VFile`, parses it into a HAST tree, and then converts that HAST tree into an NLCST tree using `hast-util-to-nlcst` with `ParseEnglish`. 
It then uses `unist-util-inspect` to log the resulting natural language tree structure, showing how text content from HTML elements is represented."},"warnings":[{"fix":"Upgrade your Node.js environment to version 16 or newer.","message":"Version 4.0.0 introduces a requirement for Node.js version 16 or higher. Running on older Node.js environments will lead to runtime errors or module resolution failures.","severity":"breaking","affected_versions":">=4.0.0"},{"fix":"Ensure your build tools and Node.js version fully support the `exports` field. If experiencing issues, verify your `tsconfig.json` (for TypeScript) or bundler configuration is up-to-date.","message":"Version 4.0.0 changed to use the `exports` field in `package.json`, which affects module resolution behavior, especially in some bundlers or older Node.js versions. This may require adjustments to build configurations.","severity":"breaking","affected_versions":">=4.0.0"},{"fix":"Update the NLCST parser packages you are using (e.g., `npm install parse-english@latest`) and ensure your code passes their latest constructor function.","message":"Version 3.0.0 introduced breaking changes related to the NLCST parsers. The `Parser` argument passed to `toNlcst` must be updated to the latest compatible version (e.g., `parse-latin`, `parse-english`, `parse-dutch`).","severity":"breaking","affected_versions":">=3.0.0"},{"fix":"Refactor your code to use ES module `import` statements. Ensure your project is configured for ESM, potentially by adding `\"type\": \"module\"` to your `package.json` or using `.mjs` file extensions.","message":"Version 2.0.0 switched the package to be ESM-only (ECMAScript Modules). CommonJS `require()` statements are no longer supported and will result in module loading errors.","severity":"breaking","affected_versions":">=2.0.0"},{"fix":"When creating your HAST tree, ensure that the parser used (e.g., `hast-util-from-html`, `rehype-parse`) is configured to retain positional information. 
For example, `fromHtml` records positional information on the nodes it produces.","message":"The `toNlcst` function requires the input HAST `tree` to carry positional information (line, column, and offset) on its nodes. The `VFile` passed must be the document the `tree` was parsed from.","severity":"gotcha","affected_versions":"*"}],"env_vars":null,"last_verified":"2026-04-19T00:00:00.000Z","next_check":"2026-07-18T00:00:00.000Z","problems":[{"fix":"Change `const { toNlcst } = require('hast-util-to-nlcst')` to `import { toNlcst } from 'hast-util-to-nlcst'`. Ensure your project runs in an ESM context (e.g., `\"type\": \"module\"` in `package.json` or `.mjs` file extension).","cause":"Attempting to use `require()` to import `hast-util-to-nlcst`, which is an ESM-only package.","error":"ERR_REQUIRE_ESM"},{"fix":"Ensure your Node.js version is 16+ (required by v4.0.0), and add `\"type\": \"module\"` to your `package.json` file. Alternatively, rename your file to use the `.mjs` extension.","cause":"You are trying to use an `import` statement in a Node.js environment that is not configured for ES modules, or in an older Node.js version.","error":"SyntaxError: Cannot use import statement outside a module"},{"fix":"Update your NLCST parser package (e.g., `npm install parse-english@latest`). Pass the parser class itself (e.g., `ParseEnglish`) or an instance of it (e.g., `new ParseEnglish()`), and check that the name you import is actually exported by the parser package.","cause":"The NLCST parser passed to `toNlcst` is either an outdated version or incorrectly imported, especially after the v3.0.0 breaking change.","error":"TypeError: Parser is not a constructor"},{"fix":"Ensure the utility or parser you use to create the HAST tree preserves positional data. 
For instance, `hast-util-from-html` records positions on the nodes it produces, whereas trees built by hand (e.g., with `hastscript`) carry none.","cause":"The input HAST `tree` lacks positional information, which `hast-util-to-nlcst` needs to map text back to the source file.","error":"Error: hast-util-to-nlcst expected position on nodes"}],"ecosystem":"npm"}