{"id":13466,"library":"lindera-wasm-cjk-bundler","title":"Lindera WASM CJK Morphological Analyzer (Bundler)","description":"Lindera WASM CJK Bundler is a JavaScript package that provides morphological analysis for Chinese, Japanese, and Korean text and is built for projects that use a module bundler. It uses WebAssembly (WASM) for tokenization and embeds the IPADIC (Japanese), ko-dic (Korean), and CC-CEDICT (Chinese) dictionaries directly in the bundle. The package is currently at version 2.3.4; the main Lindera project is actively releasing new major versions (e.g., v3.x) that introduce significant architectural changes, including new NAPI-RS Node.js bindings and updated package naming conventions. Its core differentiator is multilingual CJK support via WASM with embedded dictionaries, which offers performance benefits over pure JavaScript or server-side solutions in browser and bundler-driven Node.js environments and simplifies deployment in projects that require bundling.","status":"active","version":"2.3.4","language":"javascript","source_language":"en","source_url":"https://github.com/lindera/lindera","tags":["javascript","morphological","analysis","library","wasm","webassembly","typescript"],"install":[{"cmd":"npm install lindera-wasm-cjk-bundler","lang":"bash","label":"npm"},{"cmd":"yarn add lindera-wasm-cjk-bundler","lang":"bash","label":"yarn"},{"cmd":"pnpm add lindera-wasm-cjk-bundler","lang":"bash","label":"pnpm"}],"dependencies":[],"imports":[{"note":"This is the default export and *must* be awaited to initialize the WebAssembly module before any other functions are called.","wrong":"import { __wbg_init } from 'lindera-wasm-cjk-bundler';","symbol":"__wbg_init","correct":"import __wbg_init from 'lindera-wasm-cjk-bundler';"},{"note":"The TokenizerBuilder class is a named export used to configure and create tokenizer instances. 
It should not be imported as a default export.","wrong":"import TokenizerBuilder from 'lindera-wasm-cjk-bundler';","symbol":"TokenizerBuilder","correct":"import { TokenizerBuilder } from 'lindera-wasm-cjk-bundler';"},{"note":"The 'Token' type defines the structure of the morphological analysis results, including properties like 'surface' and 'details'. It is mainly useful to TypeScript users for type safety.","symbol":"Token","correct":"import type { Token } from 'lindera-wasm-cjk-bundler';"}],"quickstart":{"code":"import __wbg_init, { TokenizerBuilder, type Token } from 'lindera-wasm-cjk-bundler';\n\nasync function main() {\n    // Initialize the WebAssembly module. This must be called before using other functions.\n    await __wbg_init();\n\n    // Create a new TokenizerBuilder instance.\n    const builder = new TokenizerBuilder();\n\n    // Set the dictionary to be used. \"embedded://cjk\" specifies the bundled CJK dictionary.\n    builder.setDictionary(\"embedded://cjk\");\n\n    // Set the tokenization mode. 
\"normal\" is a common choice.\n    builder.setMode(\"normal\");\n\n    // Build the tokenizer instance.\n    const tokenizer = builder.build();\n\n    // Text for morphological analysis (Japanese, Korean, Chinese examples).\n    const text = \"すもももももももものうち。안녕하세요.我爱北京天安门。\";\n\n    // Perform tokenization.\n    const tokens: Token[] = tokenizer.tokenize(text);\n\n    console.log(`Tokens for \"${text}\":`);\n    tokens.forEach(token => {\n        // Log the surface form and detailed analysis of each token.\n        console.log(`- ${token.surface}: ${token.details.join(\", \")}`);\n    });\n\n    // Example with another CJK language: Korean\n    const koreanText = \"한국어를 공부합니다.\";\n    const koreanTokens: Token[] = tokenizer.tokenize(koreanText);\n    console.log(`\\nTokens for \"${koreanText}\":`);\n    koreanTokens.forEach(token => {\n        console.log(`- ${token.surface}: ${token.details.join(\", \")}`);\n    });\n\n    // Example with another CJK language: Chinese\n    const chineseText = \"上海东方明珠广播电视塔\";\n    const chineseTokens: Token[] = tokenizer.tokenize(chineseText);\n    console.log(`\\nTokens for \"${chineseText}\":`);\n    chineseTokens.forEach(token => {\n        console.log(`- ${token.surface}: ${token.details.join(\", \")}`);\n    });\n}\n\nmain().catch(console.error);","lang":"typescript","description":"Initializes the Lindera WASM CJK tokenizer, demonstrates setting the embedded CJK dictionary, tokenizing a multi-language CJK string, and logging the analysis results."},"warnings":[{"fix":"Migrate to `lindera-wasm-cjk-nodejs` for direct Node.js environments or use this `lindera-wasm-cjk-bundler` package with a module bundler (like Webpack, Rollup, esbuild) that targets Node.js. Update import paths and package names in your `package.json` accordingly.","message":"With the release of v3.0.0, the Lindera project introduced significant package naming changes. The previous `lindera-wasm-*` packages that targeted Node.js directly were removed. 
Users should now explicitly use `lindera-wasm-*-nodejs` for direct Node.js environments or `lindera-wasm-*-bundler` (this package) with a bundler configured for Node.js. Older import paths will no longer resolve.","severity":"breaking","affected_versions":"<3.0.0"},{"fix":"Always call and `await` the `__wbg_init()` function at the start of your application's execution flow before instantiating `TokenizerBuilder` or other Lindera WASM components.","message":"The core WebAssembly module requires asynchronous initialization via the default export `__wbg_init()` before any other functions (like `TokenizerBuilder`) can be used. Forgetting to `await` this function or call it at all will lead to runtime errors.","severity":"gotcha","affected_versions":">=1.0.0"},{"fix":"When targeting web browsers, ensure your project utilizes a module bundler. When targeting Node.js, for direct usage, consider `lindera-wasm-cjk-nodejs`. If using this `bundler` variant for Node.js, ensure your bundler is configured to output Node.js-compatible modules.","message":"This `lindera-wasm-cjk-bundler` package is specifically optimized for use with module bundlers (e.g., Webpack, Rollup, esbuild) for both browser and Node.js targets. Attempting to use it directly in a browser without bundling, or in a Node.js environment without a bundler, might lead to unexpected import resolution issues or runtime errors depending on your Node.js version and module system configuration.","severity":"gotcha","affected_versions":">=2.0.0"},{"fix":"Ensure `builder.setDictionary(\"embedded://cjk\")` is used when utilizing `lindera-wasm-cjk-bundler` to leverage its integrated dictionaries. For other specific dictionary needs, use their corresponding `lindera-wasm-*` packages (e.g., `lindera-wasm-ipadic-bundler`).","message":"When using `lindera-wasm-cjk-bundler`, the dictionary must be specified as `\"embedded://cjk\"` to correctly load the pre-bundled CJK dictionaries. 
Using other dictionary identifiers (e.g., `\"embedded://ipadic\"`) or attempting to load external dictionaries will fail, as this package is configured with a specific CJK dictionary set.","severity":"gotcha","affected_versions":">=2.0.0"}],"env_vars":null,"last_verified":"2026-04-19T00:00:00.000Z","next_check":"2026-07-18T00:00:00.000Z","problems":[{"fix":"Ensure you `await __wbg_init();` before attempting to use `TokenizerBuilder`. Also, verify `TokenizerBuilder` is imported as a named export: `import { TokenizerBuilder } from 'lindera-wasm-cjk-bundler';`.","cause":"The WebAssembly module was not initialized or not properly awaited, or the TokenizerBuilder symbol was imported incorrectly.","error":"TypeError: Cannot read properties of undefined (reading 'TokenizerBuilder')"},{"fix":"Use ES Module `import` syntax (`import __wbg_init, { TokenizerBuilder } from 'lindera-wasm-cjk-bundler';`) in your project and ensure your environment (e.g., Node.js with `\"type\": \"module\"` in `package.json`, or a bundler) correctly handles ESM.","cause":"Attempting to use CommonJS `require()` syntax with this ES Module package, which is designed for bundlers or native ESM environments.","error":"SyntaxError: require() of ES Module lindera-wasm-cjk-bundler from ... not supported."},{"fix":"For `lindera-wasm-cjk-bundler`, always use `builder.setDictionary(\"embedded://cjk\")`. If you intend to use a specific dictionary like IPADIC, you should use `lindera-wasm-ipadic-bundler` and `builder.setDictionary(\"embedded://ipadic\")`.","cause":"The specified dictionary name does not match the bundled dictionary or there's an issue with the dictionary path.","error":"Error: No such dictionary: embedded://cjk"}],"ecosystem":"npm","meta_description":null,"install_score":null,"install_tag":null,"quickstart_score":null,"quickstart_tag":null,"pypi_latest":null,"cli_name":"","cli_version":null}