{"id":7439,"library":"nagisa","title":"Nagisa: Japanese Tokenizer and POS Tagger","description":"Nagisa is a Python module for Japanese word segmentation and part-of-speech (POS) tagging. It is built on recurrent neural networks: the segmentation model uses character- and word-level features, and the POS tagger uses tag dictionary information. The library is designed to be simple and easy to use. The current release is 0.2.12 (as of February 2026), and recent releases have mainly addressed bugs and packaging issues.","status":"active","version":"0.2.12","language":"en","source_language":"en","source_url":"https://github.com/taishi-i/nagisa","tags":["Japanese","NLP","tokenizer","POS-tagging","word segmentation","DyNet"],"install":[{"cmd":"pip install nagisa","lang":"bash","label":"Latest stable version"}],"dependencies":[{"reason":"Python 2/3 compatibility layer.","package":"six"},{"reason":"Numerical operations, particularly for the DyNet backend.","package":"numpy"},{"reason":"Dynamic Neural Network Toolkit backend for Python 3.8+ (older Python versions use 'DyNet' instead). 
This is the core machine learning library for tokenization and POS-tagging models.","package":"DyNet38"}],"imports":[{"note":"The primary module for accessing tokenization and tagging functions.","symbol":"nagisa","correct":"import nagisa"},{"note":"While `nagisa.Tagger()` can be used for custom models, `nagisa.tagging()` directly uses the default pre-trained model for convenience and is the most common use case.","wrong":"tagger = nagisa.Tagger(); words = tagger.tagging(text)","symbol":"nagisa.tagging","correct":"words = nagisa.tagging(text)"}],"quickstart":{"code":"import nagisa\n\ntext = 'Pythonで簡単に使えるツールです'\n\n# Perform word segmentation and POS tagging\nwords = nagisa.tagging(text)\n\nprint(words) # => Python/名詞 で/助詞 簡単/形状詞 に/助動詞 使える/動詞 ツール/名詞 です/助動詞\nprint(words.words) # => ['Python', 'で', '簡単', 'に', '使える', 'ツール', 'です']\nprint(words.postags) # => ['名詞', '助詞', '形状詞', '助動詞', '動詞', '名詞', '助動詞']\n\n# Example of post-processing: extract only nouns\nnouns = nagisa.extract(text, extract_postags=['名詞'])\nprint(nouns) # => Python/名詞 ツール/名詞","lang":"python","description":"This example demonstrates how to perform basic Japanese word segmentation and Part-of-Speech tagging using the `nagisa.tagging()` function, and how to access the segmented words and their POS tags. It also shows a simple post-processing step to extract words of a specific POS tag."},"warnings":[{"fix":"Upgrade `nagisa` to version 0.2.11 or newer. 
If using an older version, ensure your `pip`, `wheel`, and `build` tools are up-to-date before attempting to install from source, or explicitly install dependencies like `DyNet38` first.","message":"Versions of `nagisa` prior to 0.2.11 had issues with Poetry installations due to missing `Requires-Dist` metadata (for `six`, `numpy`, and `DyNet`/`DyNet38`) in the PyPI `tar.gz` files.","severity":"breaking","affected_versions":"<0.2.11"},{"fix":"Upgrade `nagisa` to version 0.2.10 or newer, which includes a fix to suppress `DyNet` logs.","message":"In versions prior to 0.2.10, importing `nagisa` on Linux with Python 3.8 and above would often print verbose `DyNet` logging messages to the console during initialization.","severity":"gotcha","affected_versions":"<0.2.10"},{"fix":"Upgrade `nagisa` to version 0.2.12 or newer to ensure complete tokenization results.","message":"Version 0.2.12 fixed an issue where `nagisa` would silently drop certain words from tokenization results if the input text ended with a partially formed word (i.e., a sequence starting with a BEGIN tag but missing an END tag).","severity":"gotcha","affected_versions":"<0.2.12"},{"fix":"Upgrade `nagisa` to version 0.2.7 or newer, which renamed the conflicting file to `nagisa_utils.pyx`.","message":"Versions prior to 0.2.7 could raise an `AttributeError: module 'utils' has no attribute 'OOV'` or similar, due to a naming conflict with the internal `utils.pyx` file.","severity":"breaking","affected_versions":"<0.2.7"},{"fix":"Ensure you are using `nagisa` 0.2.8+ which added wheels for newer Python versions, and that `DyNet38` is installed. 
If issues persist, try `pip install DyNet38` before `pip install nagisa`.","message":"Older versions of `nagisa` (released before `DyNet38` wheels were widely available) could be difficult to install on Python 3.8+ because `DyNet` lacked pre-built wheels for those Python versions.","severity":"gotcha","affected_versions":"<0.2.10 (especially pre-0.2.8 for broader wheel support)"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Upgrade `nagisa` to 0.2.11 or newer, for example `pip install nagisa==0.2.12`. If you must use an older version, ensure your Poetry environment's `pip`, `wheel`, and `build` packages are up to date, or consider installing `nagisa` and its direct dependencies (`six`, `numpy`, `DyNet38`) manually with `pip` first.","cause":"Versions prior to 0.2.11 had incorrect or missing dependency metadata (`Requires-Dist`) in the `tar.gz` files on PyPI, which Poetry relies on.","error":"Poetry could not find a compatible version for package nagisa"},{"fix":"Upgrade `nagisa` to version 0.2.7 or later, for example `pip install nagisa==0.2.12`. Version 0.2.7 renamed the internal file to `nagisa_utils.pyx` to avoid conflicts.","cause":"In `nagisa` versions before 0.2.7, an internal module was named `utils.pyx`, which could conflict with other `utils` modules in the Python path or cause import errors.","error":"AttributeError: module 'utils' has no attribute 'OOV'"},{"fix":"First, ensure you are using a Python version officially supported by the latest `nagisa` release (check PyPI for the supported range, for example 3.9-3.14). Then try `pip install DyNet38` on its own. If this fails, ensure you have the necessary C++ build tools (e.g., Xcode Command Line Tools on macOS, Visual C++ Build Tools on Windows, `build-essential` on Linux). 
If problems persist, consider installing `DyNet` directly from its GitHub repository following its specific build instructions, then install `nagisa`.","cause":"`DyNet` (or `DyNet38`) is a complex C++-backed library that `nagisa` depends on. Installation can fail if pre-built wheels are not available for your specific Python version and OS, or if build tools are missing.","error":"ImportError: cannot import name 'DyNet' from 'dynet' (or similar DyNet build failures)"},{"fix":"Upgrade `nagisa` to version 0.2.12 or newer: `pip install nagisa==0.2.12`.","cause":"A bug in `nagisa` versions prior to 0.2.12 caused words to be silently dropped if the input text ended with an incomplete word structure detected by the tokenizer.","error":"Some words are missing from the tokenized output, especially at the end of the text."}]}