{"id":15686,"library":"lindera-wasm-jieba-bundler","title":"Lindera WASM with Jieba Dictionary (Bundler)","description":"lindera-wasm-jieba-bundler is a specialized npm package that provides a WebAssembly-based morphological analysis library for Chinese language text, specifically utilizing the Jieba dictionary. It is part of the Lindera project, which offers high-performance text segmentation by compiling Rust code to WebAssembly. The current stable version series is `3.x`, with `3.0.5` being the latest release, focusing on safety and refactoring. This particular package is optimized for use within JavaScript bundler environments like Webpack or Rollup, providing a compact and efficient solution for client-side or server-side (via bundlers) Chinese text processing. Lindera's key differentiators include its Rust-based performance, WASM portability across various JavaScript runtimes (browser, Node.js via bundlers), and its modular approach with separate packages for different dictionaries and target environments (web, nodejs, bundler). Releases are active, with several patch updates in the 3.0.x series.","status":"active","version":"3.0.5","language":"javascript","source_language":"en","source_url":"https://github.com/lindera/lindera","tags":["javascript","morphological","analysis","library","wasm","webassembly","typescript"],"install":[{"cmd":"npm install lindera-wasm-jieba-bundler","lang":"bash","label":"npm"},{"cmd":"yarn add lindera-wasm-jieba-bundler","lang":"bash","label":"yarn"},{"cmd":"pnpm add lindera-wasm-jieba-bundler","lang":"bash","label":"pnpm"}],"dependencies":[],"imports":[{"note":"Required to initialize the WebAssembly module; must be awaited. 
CommonJS `require` is not supported for bundler targets since v3.0.0.","wrong":"const __wbg_init = require('lindera-wasm-jieba-bundler');","symbol":"__wbg_init","correct":"import __wbg_init from 'lindera-wasm-jieba-bundler';"},{"note":"This is a named export for creating tokenizer instances. The import path is directly from the package name.","wrong":"import TokenizerBuilder from 'lindera-wasm-jieba-bundler'; // Not a default export\nimport { TokenizerBuilder } from 'lindera-wasm-jieba-bundler/lindera_wasm';","symbol":"TokenizerBuilder","correct":"import { TokenizerBuilder } from 'lindera-wasm-jieba-bundler';"},{"note":"Use `import type` for type-only imports to ensure they are stripped during compilation, optimizing bundle size and avoiding runtime errors in some environments.","symbol":"Token","correct":"import type { Token } from 'lindera-wasm-jieba-bundler';"}],"quickstart":{"code":"import __wbg_init, { TokenizerBuilder, type Token } from 'lindera-wasm-jieba-bundler';\n\nasync function main() {\n    // Initialize the WebAssembly module. This must be awaited.\n    await __wbg_init();\n\n    // Create a new TokenizerBuilder instance.\n    const builder = new TokenizerBuilder();\n\n    // Specify the Jieba dictionary to use.\n    // 'embedded://jieba' refers to the dictionary bundled with this package.\n    builder.setDictionary(\"embedded://jieba\");\n\n    // Set the tokenization mode. 
'normal' is a common default.\n    builder.setMode(\"normal\");\n\n    // Build the tokenizer.\n    const tokenizer = builder.build();\n\n    // Text to tokenize.\n    const textToAnalyze = \"上海东方明珠广播电视塔\";\n\n    // Perform tokenization.\n    const tokens: Token[] = tokenizer.tokenize(textToAnalyze);\n\n    console.log(`Tokens for: \"${textToAnalyze}\"`);\n    tokens.forEach(token => {\n        // Each token has a surface form and detailed information.\n        console.log(`- ${token.surface}: ${token.details.join(\" | \")}`);\n    });\n\n    // Example with another Chinese sentence\n    const anotherText = \"我爱北京天安门\";\n    const moreTokens: Token[] = tokenizer.tokenize(anotherText);\n    console.log(`\\nTokens for: \"${anotherText}\"`);\n    moreTokens.forEach(token => {\n        console.log(`- ${token.surface}: ${token.details.join(\" | \")}`);\n    });\n}\n\nmain().catch(console.error);","lang":"typescript","description":"This example demonstrates how to initialize the Lindera WASM module, build a tokenizer with the embedded Jieba dictionary, and perform morphological analysis on Chinese text in a bundler environment."},"warnings":[{"fix":"Review the `README.md` for the correct package suffix (`-web`, `-nodejs`, `-bundler`) matching your environment. For Node.js, `lindera-nodejs` is a separate native NAPI-RS binding, and WASM targets for Node.js were deprecated.","message":"Version 3.0.0 introduced significant changes, including the removal of the direct Node.js WASM target and a renaming of npm packages. 
Users migrating from `v2.x` to `v3.x` should review the new package naming conventions (e.g., `-web`, `-nodejs`, `-bundler`) and adapt their imports accordingly.","severity":"breaking","affected_versions":">=3.0.0"},{"fix":"Ensure you are using the correct package for your environment: `lindera-wasm-<dictionary>-web` for direct browser usage via `<script type=\"module\">`, `lindera-wasm-<dictionary>-nodejs` for Node.js before v3.0.0 (from v3.0.0 onward, use the NAPI-RS `lindera-nodejs` binding instead), and `lindera-wasm-<dictionary>-bundler` for projects using bundlers like Webpack or Rollup.","message":"Mixing up the `-web`, `-nodejs`, and `-bundler` packages can lead to runtime errors or suboptimal performance. Each package is optimized for its target environment.","severity":"gotcha","affected_versions":">=2.0.0"},{"fix":"Call `await __wbg_init();` at the start of the asynchronous function that uses Lindera, so the WASM module is fully loaded before any other call.","message":"The `__wbg_init()` function, which initializes the WebAssembly module, must be called and awaited before any other Lindera WASM functionality can be used. Forgetting to await it will lead to runtime errors.","severity":"gotcha","affected_versions":">=2.0.0"},{"fix":"For `lindera-wasm-jieba-bundler`, always use `builder.setDictionary(\"embedded://jieba\")`. Refer to the package name and documentation for the correct dictionary identifier.","message":"Specifying the correct dictionary via `builder.setDictionary(\"embedded://<dictionary-name>\")` is crucial. 
Using the wrong identifier (e.g., `\"embedded://ipadic\"` for a `jieba` package) will cause dictionary loading failures.","severity":"gotcha","affected_versions":">=2.0.0"}],"env_vars":null,"last_verified":"2026-04-21T00:00:00.000Z","next_check":"2026-07-20T00:00:00.000Z","problems":[{"fix":"Call `await __wbg_init();` before instantiating `TokenizerBuilder`.","cause":"The WebAssembly module was not initialized correctly, or the `__wbg_init()` promise was not awaited, leading to `TokenizerBuilder` not being fully available.","error":"TypeError: TokenizerBuilder is not a constructor"},{"fix":"Verify that `builder.setDictionary(\"embedded://jieba\")` is used for the `lindera-wasm-jieba-bundler` package. Other dictionary names will not work.","cause":"The dictionary identifier passed to `setDictionary()` does not match the dictionary bundled with the package or is misspelled.","error":"Error: Failed to fetch dictionary"},{"fix":"Switch to ES module `import` statements (e.g., `import { TokenizerBuilder } from 'lindera-wasm-jieba-bundler';`) and ensure your build environment supports ESM.","cause":"Attempting to use CommonJS `require()` syntax with packages intended for ESM (`import`) or bundler environments, especially after `v3.0.0`, which consolidated ESM usage.","error":"ReferenceError: require is not defined (for bundler or web packages)"}],"ecosystem":"npm"}