Micromark String Decoder Utility

2.0.1 · active · verified Sun Apr 19

`micromark-util-decode-string` is a low-level utility in the unifiedjs `micromark` ecosystem that decodes character escapes and character references in Markdown strings. It exposes a single function, `decodeString`, which interprets content such as fenced code info strings, link destinations, labels, and titles as the CommonMark specification requires. The current stable version is `2.0.1`, which is compatible with `micromark@3` and Node.js 16 and higher. Although it lives in the `micromark` monorepo, the package is versioned on its own, following the stability of its API. Releases generally track major `micromark` releases, which often drop support for unmaintained Node.js versions. The package is ESM-only. Because it follows the CommonMark rules for string decoding closely, it is a useful building block for custom `micromark` extensions and other Markdown processing tools.

Common errors

Warnings

Install

Imports

Quickstart

This quickstart shows how `decodeString` resolves the character escapes and character references found in Markdown strings: named character references (HTML entities), numeric character references, and backslash escapes. These are the common use cases for this low-level utility, which `micromark` relies on during parsing.

import { decodeString } from 'micromark-util-decode-string';

// Example 1: Decoding HTML character references like `&amp;`
console.log('Decoding HTML character references:');
const htmlRefString = 'This is a string with &amp; entities &copy;.';
console.log(`Original: "${htmlRefString}"`);
console.log(`Decoded:  "${decodeString(htmlRefString)}"`);
// Expected output: Original: "This is a string with &amp; entities &copy;."
//                  Decoded:  "This is a string with & entities ©."

console.log('\n---');

// Example 2: Decoding numeric character references (decimal and hexadecimal)
console.log('Decoding numeric character references:');
const numericRefString = 'Hello &#8364; world &#x263A;!';
console.log(`Original: "${numericRefString}"`);
console.log(`Decoded:  "${decodeString(numericRefString)}"`);
// Expected output: Original: "Hello &#8364; world &#x263A;!"
//                  Decoded:  "Hello € world ☺!"

console.log('\n---');

// Example 3: Decoding Markdown character escapes (e.g., backslash escapes)
console.log('Decoding Markdown character escapes:');
const escapeString = 'Special characters like \\*asterisks\\* and \\_underscores\\_ are unescaped.';
console.log(`Original: "${escapeString}"`);
console.log(`Decoded:  "${decodeString(escapeString)}"`);
// Expected output: Original: "Special characters like \*asterisks\* and \_underscores\_ are unescaped."
//                  Decoded:  "Special characters like *asterisks* and _underscores_ are unescaped."

console.log('\n---');

// Example 4: Mixed string with common markdown context, simulating internal micromark use
console.log('Mixed context decoding:');
const mixedString = 'A link title with some \\& escaped characters — "Example"';
console.log(`Original: "${mixedString}"`);
console.log(`Decoded:  "${decodeString(mixedString)}"`);
// Expected output: Original: "A link title with some \& escaped characters — "Example""
//                  Decoded:  "A link title with some & escaped characters — "Example""

// This utility is typically used internally by micromark or in custom micromark extensions
// for precise Markdown string processing.
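
The decoding rules exercised above can be sketched without the package. The block below is a simplified, hypothetical stand-in (`decodeSketch`, `namedSketch`, and `pattern` are names invented for this illustration) and is not the package's source. It assumes three rules: a backslash before ASCII punctuation yields that punctuation character, a numeric character reference yields its code point (with invalid code points replaced by U+FFFD), and a named reference is resolved only if it is known. The tiny lookup table stands in for the full HTML entity list that a real decoder needs.

```javascript
// Illustrative sketch only: a simplified decoder, not the package's implementation.
// The table below is a deliberately tiny stand-in for the full HTML entity list.
const namedSketch = {amp: '&', copy: '©', quot: '"'};

// Alternative 1: backslash followed by one ASCII punctuation character.
// Alternative 2: `&...;` holding a decimal, hexadecimal, or named reference.
const pattern = /\\([!-/:-@[-`{-~])|&(#(?:\d{1,7}|x[\da-f]{1,6})|[\da-z]{1,31});/gi;

function decodeSketch(value) {
  return value.replace(pattern, (whole, escaped, reference) => {
    // Character escape: keep only the escaped punctuation character.
    if (escaped) return escaped;

    // Numeric character reference: `#123` (decimal) or `#x7B` (hexadecimal).
    if (reference[0] === '#') {
      const hex = reference[1] === 'x' || reference[1] === 'X';
      const code = Number.parseInt(reference.slice(hex ? 2 : 1), hex ? 16 : 10);
      const invalid =
        code === 0 || code > 0x10ffff || (code >= 0xd800 && code <= 0xdfff);
      return invalid ? '\uFFFD' : String.fromCodePoint(code);
    }

    // Named reference: resolve known names, leave unknown ones untouched.
    return Object.prototype.hasOwnProperty.call(namedSketch, reference)
      ? namedSketch[reference]
      : whole;
  });
}

console.log(decodeSketch('\\*a\\* &#8364; &#x263A; &copy; &unknown;'));
console.log(decodeSketch('\\&amp;'));
```

In this sketch, an escape takes priority over a reference that starts at the same place: in `\&amp;` the backslash consumes the `&`, so the remaining `amp;` is left as literal text. A backslash before a non-punctuation character, such as `\a`, is left unchanged.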
