{"id":12088,"library":"string_decoder","title":"StringDecoder for Userland","description":"The `string_decoder` package provides a userland implementation of the Node.js core `string_decoder` module. It decodes `Buffer` objects into strings and correctly handles multi-byte UTF-8 and UTF-16 characters that span multiple buffer chunks, which prevents malformed characters when processing streamed or chunked data. It is maintained by the Node.js Streams Working Group. The current stable version is 1.3.0, released in 2019. Prior to version 1.0.0, its versions mirrored those of Node.js core; since 1.0.0, it adheres to Semantic Versioning. Its key differentiator is being a direct, semantically versioned mirror of the Node.js core implementation, making it suitable for environments where Node's built-in module is not directly available or where parity with a specific version is required, such as in browserify bundles.","status":"maintenance","version":"1.3.0","language":"javascript","source_language":"en","source_url":"git://github.com/nodejs/string_decoder","tags":["javascript","string","decoder","browser","browserify"],"install":[{"cmd":"npm install string_decoder","lang":"bash","label":"npm"},{"cmd":"yarn add string_decoder","lang":"bash","label":"yarn"},{"cmd":"pnpm add string_decoder","lang":"bash","label":"pnpm"}],"dependencies":[{"reason":"Provides a safe buffer implementation, especially for older Node.js versions or environments where Buffer API differences might exist.","package":"safe-buffer","optional":false}],"imports":[{"note":"ESM named import is preferred in modern Node.js and bundlers. 
The package's `main` entry point is CommonJS, so direct ESM import relies on Node.js's CJS-ESM interop or a bundler.","wrong":"const StringDecoder = require('string_decoder');","symbol":"StringDecoder","correct":"import { StringDecoder } from 'string_decoder';"},{"note":"`StringDecoder` is a named property of the module's exports, so binding the whole module to `StringDecoder` and then calling `new StringDecoder()` fails. `require('string_decoder').StringDecoder` is equivalent to the destructured form, which is simply more concise. Note that in Node.js the bare specifier `string_decoder` resolves to the built-in core module; `require('string_decoder/')` (with a trailing slash) loads the userland package from `node_modules`.","wrong":"const StringDecoder = require('string_decoder');","symbol":"StringDecoder (CommonJS)","correct":"const { StringDecoder } = require('string_decoder');"}],"quickstart":{"code":"import { StringDecoder } from 'string_decoder';\nimport { Buffer } from 'buffer';\n\nconst decoder = new StringDecoder('utf8');\n\n// Imagine receiving a multi-byte character (like '€') split across network packets.\n// The Euro symbol (€) is U+20AC, which is E2 82 AC in UTF-8.\n\nconst chunk1 = Buffer.from([0xE2]); // First byte of '€'\nconst chunk2 = Buffer.from([0x82]); // Second byte of '€'\nconst chunk3 = Buffer.from([0xAC, 0x61, 0x62]); // Third byte of '€' plus 'ab'\n\nlet decodedString = '';\ndecodedString += decoder.write(chunk1); // Returns '' (incomplete character is buffered)\ndecodedString += decoder.write(chunk2); // Returns '' (still incomplete)\ndecodedString += decoder.write(chunk3); // Returns '€ab' (character completed, followed by 'ab')\ndecodedString += decoder.end(); // Flushes any bytes still buffered (none here, so '')\n\nconsole.log(decodedString);\n// Expected output: '€ab'\n\n// Without StringDecoder, calling toString() on each chunk separately corrupts the split character.\nconst naive = [chunk1, chunk2, chunk3].map((chunk) => chunk.toString('utf8')).join('');\nconsole.log(naive);\n// Contains U+FFFD replacement characters in place of '€'\n\n// Concatenating all chunks first also decodes correctly, but requires holding the whole input in memory.\nconst simpleConcat = Buffer.concat([chunk1, chunk2, chunk3]).toString('utf8');\nconsole.log(simpleConcat);\n// Expected output: '€ab'\n","lang":"javascript","description":"Demonstrates how to use `StringDecoder` to correctly handle 
multi-byte UTF-8 characters split across multiple `Buffer` chunks, preventing data corruption."},"warnings":[{"fix":"When upgrading from a version older than 1.0.0, check the release notes, because pre-1.0.0 version numbers do not signal breaking changes. For versions >=1.0.0, follow standard SemVer practices.","message":"Prior to version 1.0.0, `string_decoder` versions mirrored Node.js core versions, which did not follow semantic versioning. Starting with 1.0.0, the package adopted standard semantic versioning, meaning major version bumps now indicate breaking changes in this userland package, independent of Node.js core.","severity":"breaking","affected_versions":"<1.0.0"},{"fix":"Use `StringDecoder.write()` for chunks of data that might end in a partial multi-byte character, and call `StringDecoder.end()` once the input is finished to flush any remaining buffered bytes. Use `Buffer.prototype.toString()` only on buffers known to contain complete character sequences.","message":"The `string_decoder` module is specifically designed to correctly handle multi-byte characters that are split across `Buffer` instances when streamed. Calling `Buffer.prototype.toString()` on each chunk separately produces replacement characters (�) wherever a multi-byte sequence is split across a chunk boundary.","severity":"gotcha","affected_versions":"All versions"},{"fix":"For new projects or cross-platform code, consider using `TextDecoder` for decoding binary data to strings. `string_decoder` remains relevant for Node.js-specific compatibility layers or older codebases.","message":"In modern JavaScript environments, the WHATWG `TextDecoder` API (`new TextDecoder('utf-8')`) is the generally recommended and more broadly compatible alternative for decoding text from binary data, especially in browser and Web Worker contexts. `string_decoder` is considered a legacy utility module for Node.js compatibility.","severity":"deprecated","affected_versions":"All versions"},{"fix":"Verify bundler configurations. 
For Webpack 5, which no longer polyfills Node.js core modules automatically, add `resolve.fallback: { string_decoder: require.resolve('string_decoder/') }`. For Rollup/esbuild, check plugins that handle Node.js built-ins or explicitly externalize/alias if necessary. Handling of Node.js built-ins differs between bundler versions, so check the documentation for the version in use.","message":"When bundling for the browser using tools like Webpack or Rollup, ensure that `string_decoder` is correctly aliased or handled. Because it mirrors a Node.js core module, bundlers might treat the specifier as a Node.js built-in or misinterpret its CommonJS exports, leading to errors like 'StringDecoder' is not exported by '.../lib/string_decoder.js'.","severity":"gotcha","affected_versions":"All versions when bundling for browser"}],"env_vars":null,"last_verified":"2026-04-19T00:00:00.000Z","next_check":"2026-07-18T00:00:00.000Z","problems":[{"fix":"Break very large input buffers into smaller chunks before passing them to `StringDecoder.write()`. Node.js has a maximum string length (e.g., ~536MB in V8), which applies to the output of `string_decoder`. Later Node.js versions throw `ERR_STRING_TOO_LONG` in this situation instead.","cause":"Attempting to decode an excessively large buffer, exceeding V8's maximum string length. In older Node.js versions, `StringDecoder.write()` might return `undefined` instead of throwing an explicit error.","error":"TypeError: Cannot read property 'length' of undefined"},{"fix":"Ensure your bundler correctly handles CommonJS modules and Node.js built-ins. 
This might involve adding a CJS plugin (e.g., `@rollup/plugin-commonjs`), configuring aliases, or explicitly telling the bundler to treat `string_decoder` as an external dependency.","cause":"This error typically occurs in bundling environments (like Rollup or esbuild) where the bundler tries to resolve `string_decoder` as an ESM module and fails to find a named export `StringDecoder`, often due to it being a CommonJS module by default.","error":"Error: 'StringDecoder' is not exported by ../../../../../../../../node_modules/string_decoder/lib/string_decoder.js, imported by node_modules/@frida/readable-stream/lib/readable.js"},{"fix":"Replace direct `Buffer.prototype.toString()` calls on streamed or chunked data with `StringDecoder.write()` and `StringDecoder.end()`. This ensures that multi-byte characters are correctly assembled before decoding.","cause":"You are likely using `Buffer.prototype.toString()` directly on partial buffers that contain incomplete multi-byte characters. `StringDecoder` is specifically designed to prevent this by buffering incomplete sequences.","error":"character '�' (U+FFFD) appears in output unexpectedly"}],"ecosystem":"npm"}