HTML to Markdown Converter
node-html-markdown is a fast, cross-platform library for converting HTML to Markdown in both Node.js and browser environments. The current stable version is 2.0.0. Its primary focus is speed: benchmarks show significantly faster conversion than popular alternatives such as Turndown. It also aims for human-readable output, producing clean, concise Markdown with consistent spacing and avoiding common formatting problems such as excessive line breaks. The project has no fixed release schedule but is actively maintained; past updates added table support and fixed browser-compatibility issues, which makes it a dependable choice for converting large volumes of HTML.
Warnings
- breaking: Version 2.0.0 explicitly denies special handling for certain elements that should not be present inside code blocks. This can change how HTML within `<pre><code>` tags is rendered.
- gotcha: A `NodeHtmlMarkdown` instance performs better when reused across conversions. Creating a new converter for every conversion adds overhead that becomes noticeable with large volumes of data.
- gotcha: In browser environments, the `preferNativeParser` option defaults to `false`. Setting it to `true` uses the native `DOMParser`, which may improve performance and compatibility.
- breaking: Version 2.0.0 requires Node.js 20.0.0 or higher. Running on older Node.js versions may cause runtime errors or unexpected behavior.
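The Node.js requirement in the last warning can be checked at startup. The following is a minimal sketch and not part of node-html-markdown: `MIN_NODE_MAJOR`, `parseMajor`, and `isSupportedNode` are hypothetical names, and the threshold of 20 is taken from the warning above.

```javascript
// Assumption: minimum major version copied from the warning above.
const MIN_NODE_MAJOR = 20;

// Extract the major version from strings such as "20.11.1" or "v20.11.1".
// Returns NaN when the string does not look like a version.
function parseMajor(version) {
  const match = /^v?(\d+)\./.exec(version);
  return match ? Number(match[1]) : NaN;
}

// True when the given Node.js version string meets the minimum major version.
function isSupportedNode(version) {
  const major = parseMajor(version);
  return Number.isInteger(major) && major >= MIN_NODE_MAJOR;
}

// Only run the check where `process` exists (it does not in a browser).
if (typeof process !== 'undefined' && process.versions && process.versions.node) {
  if (!isSupportedNode(process.versions.node)) {
    console.warn(
      `node-html-markdown 2.x expects Node.js ${MIN_NODE_MAJOR}+, found ${process.versions.node}`
    );
  }
}
```

The check only warns rather than throwing, so an application can decide for itself whether an unsupported runtime should be fatal.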
Install
- npm install node-html-markdown
- yarn add node-html-markdown
- pnpm add node-html-markdown
Imports
- NodeHtmlMarkdown (a named export, not the module's default)
  ES modules: import { NodeHtmlMarkdown } from 'node-html-markdown'
  CommonJS: const { NodeHtmlMarkdown } = require('node-html-markdown')
- NodeHtmlMarkdownOptions
  import { NodeHtmlMarkdownOptions } from 'node-html-markdown'
- translate (static)
  For a one-off conversion: NodeHtmlMarkdown.translate(htmlString)
  For repeated conversions: const nhm = new NodeHtmlMarkdown(); nhm.translate(htmlString);
Quickstart
import { NodeHtmlMarkdown } from 'node-html-markdown';
// Initialize with options for a reusable instance
// Custom options can improve output or handle specific HTML structures.
const nhm = new NodeHtmlMarkdown(
  {
    codeFence: '```', // Customize code block fence (default: ```)
    bulletMarker: '-', // Use hyphens for list bullets (default: *)
    codeBlockStyle: 'fenced', // Prefer fenced code blocks (default: fenced)
    preferNativeParser: typeof window !== 'undefined' // Use native DOMParser in browsers if available
  },
  /* customTranslators (optional) */ undefined,
  /* customCodeBlockTranslators (optional) */ undefined
);
// Example HTML input to be converted
const htmlContent = `
<h1>Welcome to my document</h1>
<p>This is a <b>paragraph</b> with <i>some emphasis</i> and a <s>strikethrough</s> word.</p>
<ul>
<li>First item in a list.</li>
<li>Second item.</li>
</ul>
<pre><code>function greet() {
console.log('Hello, world!');
}
greet();</code></pre>
<p>Check out our <a href="https://example.com/docs">documentation</a>.</p>
<table><thead><tr><th>Header 1</th><th>Header 2</th></tr></thead><tbody><tr><td>Data 1</td><td>Data 2</td></tr></tbody></table>
`;
// Convert a single HTML string to Markdown
const markdownOutput = nhm.translate(htmlContent);
console.log('Converted Markdown Output:\n');
console.log(markdownOutput);
// The instance can be reused for multiple conversions efficiently
const anotherHtml = '<div><p>Another piece of HTML.</p></div>';
const anotherMarkdown = nhm.translate(anotherHtml);
console.log('\nAnother Converted Markdown:\n');
console.log(anotherMarkdown);
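
The reuse pattern at the end of the quickstart can be wrapped in a small helper so the converter is created only once. This is a minimal sketch that assumes only that the converter exposes a `translate(html)` method, as `NodeHtmlMarkdown` instances do; `createBatchConverter`, `createConverter`, and `convertAll` are hypothetical names, not part of the library.

```javascript
// Hypothetical helper (not part of node-html-markdown): returns a function
// that converts many HTML strings while creating the converter only once.
// `createConverter` must return an object with a translate(html) method,
// for example: () => new NodeHtmlMarkdown()
function createBatchConverter(createConverter) {
  let converter = null; // created lazily on first use, then reused

  return function convertAll(htmlDocs) {
    if (converter === null) {
      converter = createConverter();
    }
    // Every document goes through the same converter instance.
    return htmlDocs.map((html) => converter.translate(html));
  };
}
```

With the library installed, this might be used as `const convertAll = createBatchConverter(() => new NodeHtmlMarkdown());` followed by `convertAll([htmlContent, anotherHtml])`. Taking a factory instead of importing the library directly keeps the helper easy to test with a stub converter.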