Japanese Character Encoding Conversion
encoding-japanese is a JavaScript library designed for detecting and converting various character encodings, with a strong focus on Japanese encodings like Shift_JIS, EUC-JP, and ISO-2022-JP, alongside common Unicode formats like UTF-8 and UTF-16. Unlike standard JavaScript string handling, which is internally UTF-16, this library processes encodings as arrays of character code values, allowing for conversions between diverse character sets. The current stable version is 2.2.0, with releases occurring infrequently, often driven by feature additions or maintenance updates. Its key differentiator is robust support for a wide range of Japanese encodings and its ability to handle character codes as arrays, making it suitable for binary data manipulation (e.g., with `Uint8Array` or Node.js `Buffer`).
Common errors
- TypeError: Converting circular structure to JSON
  cause: Passing an object that contains a circular reference to `JSON.stringify`, for example a wrapper object that holds the character codes alongside a reference back to itself. A `Uint8Array` on its own does not throw, but it serializes as an index-keyed object rather than a JSON array.
  fix: Serialize only the character codes, and convert them to a JSON-compatible form first: a regular array of numbers (`Array.from(uint8Array)`), or a string via `Encoding.codeToString` when the codes are Unicode code units.
- Cannot find module 'encoding-japanese'
  cause: The package is not installed, or the import path is wrong. This shows up most often in environments with strict module resolution or with specific build tools.
  fix: Verify the package is installed (`npm install encoding-japanese`) and ensure your import statement is `import * as Encoding from 'encoding-japanese';` for ESM or `const Encoding = require('encoding-japanese');` for CommonJS. Check your bundler or TypeScript configuration if paths are being resolved incorrectly.
- Unrepresentable character error or incorrect character output after conversion
  cause: A character in the `from` encoding cannot be represented in the `to` encoding, and the `fallback` option is either unset or set to 'error'. When `fallback` is unset, such characters are replaced with `?`; with 'error', the conversion throws.
  fix: Use the `fallback` option in `Encoding.convert` (e.g., `fallback: 'html-entity'`, `fallback: 'ignore'`, or `fallback: 'error'`) to define explicitly how unrepresentable characters are handled. If 'error' is used, wrap the conversion in a `try...catch` block to handle the error gracefully.
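The serialization advice in the first item can be sketched without the library at all. This is a minimal illustration using only built-in JavaScript; the sample bytes are the Shift_JIS encoding of 'あ' and the variable names are illustrative.

```javascript
// Bytes for 'あ' in Shift_JIS.
const bytes = new Uint8Array([0x82, 0xA0]);

// A typed array is serialized as an index-keyed object, not as a JSON array.
const naive = JSON.stringify(bytes);

// Converting to a plain array first yields a compact JSON array of numbers.
const json = JSON.stringify(Array.from(bytes));

// After parsing, rebuild the typed array explicitly.
const restored = Uint8Array.from(JSON.parse(json));

console.log(naive);    // {"0":130,"1":160}
console.log(json);     // [130,160]
console.log(restored); // the original two bytes, as a Uint8Array again
```

The plain-array form is also what `Encoding.convert` returns by default for array input, so it can be stringified directly.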
Warnings
- breaking Version 2.0.0 introduced a breaking change by adding a `fallback` option to `Encoding.convert`. Previously, unrepresentable characters might have been handled differently or implicitly; explicit handling is now available through `fallback` values such as 'html-entity', 'ignore', and 'error'. Although the change adds functionality, existing code that relied on the earlier implicit behavior for unrepresentable characters might need adjustment.
- gotcha JavaScript strings are internally UTF-16. `encoding-japanese` primarily works with arrays of character codes (e.g., `Uint8Array`). Direct string arguments to `convert` or `detect` are not the primary use case and might lead to unexpected results if not correctly converted to/from character code arrays first using `stringToCode` and `codeToString`.
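- gotcha When using `Encoding.convert`, the `type` option determines the return format. With array input, omitting it or using `type: 'array'` returns a plain number array, `type: 'string'` returns a string, and `type: 'arraybuffer'` returns a typed array. `'Uint8Array'` is not a documented `type` value, so wrap the result explicitly (e.g., `new Uint8Array(result)`) when a byte array is needed. Mixing these types in subsequent operations without explicit conversion can lead to errors.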
Install
- npm install encoding-japanese
- yarn add encoding-japanese
- pnpm add encoding-japanese
Imports
- Encoding
  import Encoding from 'encoding-japanese';
  import * as Encoding from 'encoding-japanese';
- Encoding.detect
  import { detect } from 'encoding-japanese';
  import { detect } from 'encoding-japanese/lib/encoding';
- Encoding.convert
  const { convert } = require('encoding-japanese');
  const Encoding = require('encoding-japanese');
  const convertedData = Encoding.convert(...);
Quickstart
import * as Encoding from 'encoding-japanese';

// Example: convert Shift_JIS bytes to UTF-8 bytes
const shiftJisBytes = new Uint8Array([
  0x82, 0xA0, 0x82, 0xA2, 0x82, 0xA4, 0x82, 0xA6, 0x82, 0xA8 // 'あいうえお' in Shift_JIS
]);
const utf8Bytes = Encoding.convert(shiftJisBytes, {
  to: 'UTF8',
  from: 'SJIS',
  type: 'array'
});

// Example: decode the same bytes to a JavaScript string.
// codeToString maps each code to one UTF-16 unit, so UTF-8 bytes would come out
// garbled; convert to 'UNICODE' and ask for a string instead.
const decodedString = Encoding.convert(shiftJisBytes, {
  to: 'UNICODE',
  from: 'SJIS',
  type: 'string'
});
console.log('Original Shift_JIS (bytes):', shiftJisBytes);
console.log('Converted UTF-8 (bytes):', utf8Bytes);
console.log('Decoded string:', decodedString);

// Example: detect encoding
const detectedEncoding = Encoding.detect(shiftJisBytes);
console.log('Detected encoding:', detectedEncoding);

// Example: URL-encode the UTF-8 bytes of a string
const urlEncoded = Encoding.urlEncode(
  Encoding.convert(Encoding.stringToCode('テスト&123'), { to: 'UTF8', from: 'UNICODE' })
);
console.log('URL Encoded:', urlEncoded);