Lindera WASM (IPADIC) for Bundlers
This package, `lindera-wasm-ipadic-bundler`, is a WebAssembly-based Japanese morphological analyzer built for JavaScript bundler environments. The widely used IPADIC dictionary is embedded in the package, so text can be tokenized offline without a separate dictionary download. While the broader Lindera WASM ecosystem has moved to v3.x (e.g., `lindera-wasm-ipadic-web` and `lindera-wasm-ipadic-nodejs`), this 'bundler' target package is currently at v2.3.4. The Lindera project generally releases frequent minor updates and patches, and major versions introduce broader architectural changes such as revised package structures. Its key differentiators are a WebAssembly foundation for performance, bundled dictionaries, and dedicated packages for each JavaScript environment (browser, Node.js, bundler).
Common errors
- ReferenceError: require is not defined
  cause: Using the CommonJS `require()` syntax with a bundler-targeted package in an ECMAScript Module (ESM) context, or using a `-bundler` package in a Node.js CJS project.
  fix: For bundler environments, use `import init, { TokenizerBuilder } from 'lindera-wasm-ipadic-bundler';`. If targeting Node.js, use `lindera-wasm-ipadic-nodejs` with `const { TokenizerBuilder } = require('lindera-wasm-ipadic-nodejs');`.
- TypeError: init is not a function
  cause: Incorrectly importing the default `init` function as a named export, or attempting to call it before it's properly assigned (e.g., due to an incorrect bundler configuration).
  fix: Ensure `init` is imported as the default export: `import init from 'lindera-wasm-ipadic-bundler';`. Also, verify your bundler is correctly handling default WASM module exports.
- Error: The requested dictionary 'embedded://ipadic' is not found.
  cause: The specified dictionary path (`embedded://ipadic`) is incorrect, or a Lindera WASM package without an embedded dictionary was installed (e.g., `lindera-wasm-bundler` instead of `lindera-wasm-ipadic-bundler`).
  fix: Double-check the dictionary path. Ensure you have installed the correct package that bundles the desired dictionary, such as `lindera-wasm-ipadic-bundler` for IPADIC.
- TypeError: Failed to execute 'fetch' on 'WorkerGlobalScope': Failed to fetch dynamically imported module
  cause: This usually occurs in browser/web worker environments when a bundler fails to correctly output or serve the `.wasm` chunk file, or the server path for the WASM file is incorrect.
  fix: Verify your bundler (e.g., Webpack, Rollup) is configured to handle `.wasm` files. Ensure the output directory of your build includes the `.wasm` asset and that your web server is correctly serving it with the appropriate MIME type (`application/wasm`).
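For the `.wasm` handling mentioned in the last fix, one common approach in webpack 5 is to enable its async WebAssembly experiment. The fragment below is a minimal sketch that assumes webpack 5. The name `webpackConfig` is illustrative, and other bundlers such as Rollup or Vite need their own plugins or settings.

```javascript
// Minimal sketch of a webpack 5 configuration fragment (assumption: webpack 5).
// `webpackConfig` is an illustrative name; export this object from webpack.config.js.
const webpackConfig = {
  mode: "production",
  experiments: {
    // Lets webpack treat imported .wasm files as async modules.
    asyncWebAssembly: true,
  },
};
```

After building, it is still worth checking that the `.wasm` file appears in the output directory and is served as `application/wasm`.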
Warnings
- breaking The `lindera-wasm-ipadic-bundler` package is currently at v2.3.4, while other Lindera WASM packages (e.g., `-web`, `-nodejs`) have progressed to v3.x. This means features, bug fixes, and breaking changes introduced in Lindera v3 are not yet present in this specific 'bundler' package. Users expecting v3 behavior will not find it here.
- gotcha Lindera provides separate npm packages for different JavaScript environments (e.g., `-web`, `-nodejs`, `-bundler`). Using the incorrect package for your target environment (e.g., `lindera-wasm-ipadic-nodejs` in a browser bundler context) will lead to runtime errors due to environment-specific WASM compilation targets and API differences.
- gotcha The WebAssembly module must be asynchronously initialized by calling `await init()` (where `init` is the default export) before any other functions from the package can be used. Forgetting to call or await this initialization function will result in runtime errors.
- gotcha WASM module loading and initialization are asynchronous operations. Attempting to use the `TokenizerBuilder` or other functionality synchronously before `init()` has resolved will lead to errors.
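One way to avoid the initialization gotchas above is to route every caller through a single cached promise, so `init()` runs once and everything else awaits the same result. The helper below is a sketch, not part of the package: `createInitOnce` and `ensureReady` are hypothetical names, and the initializer is passed in as an argument so the snippet stays self-contained.

```javascript
// Sketch: wrap an async initializer so it runs at most once.
// `createInitOnce` and `ensureReady` are hypothetical names, not package APIs.
function createInitOnce(initFn) {
  let pending = null; // cached promise shared by all callers

  return function ensureReady() {
    if (pending === null) {
      // The executor runs immediately; a synchronous throw inside initFn
      // is turned into a rejected promise instead of escaping to the caller.
      pending = new Promise((resolve) => resolve(initFn()));
    }
    return pending;
  };
}
```

In application code this might be used as `const ensureReady = createInitOnce(init);` followed by `await ensureReady();` before building a tokenizer, where `init` is the package's default export as described above.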
Install
- npm install lindera-wasm-ipadic-bundler
- yarn add lindera-wasm-ipadic-bundler
- pnpm add lindera-wasm-ipadic-bundler
Imports
- init
  wrong: `import { init, TokenizerBuilder } from 'lindera-wasm-ipadic-bundler';`
  correct: `import init, { TokenizerBuilder } from 'lindera-wasm-ipadic-bundler';`
- TokenizerBuilder
  wrong: `import TokenizerBuilder from 'lindera-wasm-ipadic-bundler';`
  correct: `import { TokenizerBuilder } from 'lindera-wasm-ipadic-bundler';`
- Token
  `import type { Token } from 'lindera-wasm-ipadic-bundler';`
Quickstart
import init, { TokenizerBuilder } from 'lindera-wasm-ipadic-bundler';

async function main() {
  // Initialize the WebAssembly module. This is crucial and must be awaited
  // before any other functions from the module can be used.
  await init();

  // Create a new TokenizerBuilder instance to configure the tokenizer.
  const builder = new TokenizerBuilder();

  // Specify the dictionary to use. 'embedded://ipadic' uses the dictionary
  // bundled with this package.
  builder.setDictionary("embedded://ipadic");

  // Set the tokenization mode; 'normal' is suitable for general text.
  builder.setMode("normal");

  // Build the tokenizer with the specified settings.
  const tokenizer = builder.build();

  // Define the Japanese sentence to be tokenized.
  const sentence = "すもももももももものうち";
  const tokens = tokenizer.tokenize(sentence);

  console.log(`Tokenizing sentence: "${sentence}"`);
  console.log("--- Tokens ---");

  // Iterate over the resulting tokens and print their surface form and details.
  tokens.forEach(token => {
    console.log(`${token.surface} [${token.details.join(", ")}]`);
  });

  // Demonstrate accessing specific details of a token.
  if (tokens.length > 0) {
    const firstToken = tokens[0];
    console.log(`\nFirst token surface: ${firstToken.surface}`);
    // The details array contains information like part-of-speech, conjugation, etc.
    console.log(`First token part of speech: ${firstToken.details[0]}`);
  }
}

// Execute the main asynchronous function.
main().catch(console.error);
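The Quickstart reads `details` by position, which gets hard to follow beyond index 0. The sketch below names each position, assuming the conventional IPADIC feature order (part of speech, three subcategories, conjugation type, conjugation form, base form, reading, pronunciation). The English field names and the `describeToken` helper are illustrative, not part of the package, and the sample token is hand-written rather than captured from the library.

```javascript
// Assumed IPADIC feature order; the English field names are illustrative.
const IPADIC_FIELDS = [
  "partOfSpeech",
  "posDetail1",
  "posDetail2",
  "posDetail3",
  "conjugationType",
  "conjugationForm",
  "baseForm",
  "reading",
  "pronunciation",
];

// Convert a { surface, details } token into an object with named fields.
// IPADIC uses "*" for fields that do not apply; those become null here,
// as do positions missing from a shorter details array.
function describeToken(token) {
  const described = { surface: token.surface };
  IPADIC_FIELDS.forEach((name, i) => {
    const value = token.details[i];
    described[name] = value === undefined || value === "*" ? null : value;
  });
  return described;
}

// Hand-written sample token for illustration, not captured library output.
const sampleToken = {
  surface: "すもも",
  details: ["名詞", "一般", "*", "*", "*", "*", "すもも", "スモモ", "スモモ"],
};

console.log(describeToken(sampleToken));
```

Keeping the field list in one array means a different dictionary layout only requires changing that array, not the call sites.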