StringDecoder for Userland

1.3.0 · maintenance · verified Sun Apr 19

The `string_decoder` package is a userland implementation of the Node.js core `string_decoder` module. It decodes `Buffer` objects into strings and correctly handles multi-byte UTF-8 and UTF-16 characters that span multiple buffer chunks: incomplete byte sequences are held back until the remaining bytes arrive, which prevents malformed characters when processing streamed or chunked data. It is maintained by the Node.js Streams Working Group.

The current stable version is 1.3.0, released 7 years ago; the package is mature and in maintenance mode. Before version 1.0.0, its version numbers mirrored those of Node.js core; since 1.0.0, it follows Semantic Versioning. Its key differentiator is that it is a direct, semantically versioned mirror of the Node.js core implementation, which makes it suitable for environments where Node's built-in module is not directly available or where parity with a specific version is required, such as browserify bundles.
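
The same buffering applies to UTF-16. The following is a minimal sketch, not taken from the package's own documentation, of a surrogate pair split between two chunks. It assumes a Node.js runtime, where the bare specifier `string_decoder` resolves to the built-in core module that this package mirrors; the variable names are illustrative only.

```javascript
// Assumption: running under Node.js, where 'string_decoder' resolves to the
// built-in core module that the userland package mirrors.
const { StringDecoder } = require('string_decoder');

const utf16Decoder = new StringDecoder('utf16le');

// '😀' (U+1F600) is encoded in UTF-16LE as a surrogate pair: bytes 3D D8 00 DE.
const emojiBytes = Buffer.from('😀', 'utf16le');
const highSurrogate = emojiBytes.subarray(0, 2); // 3D D8
const lowSurrogate = emojiBytes.subarray(2); // 00 DE

// The high surrogate alone is not a complete character, so it is held back.
const firstWrite = utf16Decoder.write(highSurrogate);
// Once the low surrogate arrives, the full character is returned.
const secondWrite = utf16Decoder.write(lowSurrogate);

console.log(JSON.stringify(firstWrite)); // ""
console.log(secondWrite); // 😀
```

Splitting on the surrogate boundary keeps the sketch independent of how a particular implementation treats odd-length UTF-16 chunks.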

Common errors

Warnings

Install

Imports

Quickstart

Demonstrates how to use `StringDecoder` to correctly handle multi-byte UTF-8 characters split across multiple `Buffer` chunks, preventing data corruption.

import { StringDecoder } from 'string_decoder';
import { Buffer } from 'buffer';

const decoder = new StringDecoder('utf8');

// Imagine receiving a multi-byte character (like '€') split across network packets.
// The Euro symbol (€) is U+20AC, which is E2 82 AC in UTF-8.

const chunk1 = Buffer.from([0xE2]); // First byte of '€'
const chunk2 = Buffer.from([0x82]); // Second byte of '€'
const chunk3 = Buffer.from([0xAC, 0x61, 0x62]); // Third byte of '€' plus 'ab'

let decodedString = '';
decodedString += decoder.write(chunk1); // Returns '' (incomplete character is buffered)
decodedString += decoder.write(chunk2); // Returns '' (still incomplete)
decodedString += decoder.write(chunk3); // Returns '€ab' (character completed, followed by 'ab')
decodedString += decoder.end(); // Flushes the internal buffer; returns '' here because nothing is left

console.log(decodedString);
// Expected output: '€ab'

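// Without StringDecoder, calling toString() on each chunk separately corrupts the split character.
const naive = [chunk1, chunk2, chunk3].map((chunk) => chunk.toString('utf8')).join('');
console.log(naive);
// Expected output: '\uFFFD\uFFFD\uFFFDab' (replacement characters instead of '€')

// Concatenating all chunks before decoding also works, but only once every chunk has arrived.
const simpleConcat = Buffer.concat([chunk1, chunk2, chunk3]).toString('utf8');
console.log(simpleConcat);
// Expected output: '€ab'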
