{"id":6936,"library":"unidic-lite","title":"UniDic-lite","description":"unidic-lite is a small version of UniDic, a Japanese morphological analysis dictionary, packaged for Python. It is designed to be installable directly via pip without requiring additional downloads, unlike the larger 'unidic' package. It uses UniDic 2.1.2 from 2013 and occupies approximately 250MB of disk space after installation. The current version is 1.0.8, released in January 2021, and its release cadence is infrequent as it primarily serves as a static dictionary resource.","status":"maintenance","version":"1.0.8","language":"en","source_language":"en","source_url":"https://github.com/polm/unidic-lite","tags":["japanese","nlp","dictionary","mecab","tokenization","morphological analysis"],"install":[{"cmd":"pip install unidic-lite","lang":"bash","label":"Install UniDic-lite"}],"dependencies":[{"reason":"Required to use the UniDic dictionary for Japanese morphological analysis; unidic-lite only provides the dictionary data.","package":"fugashi","optional":false},{"reason":"Alternative to fugashi; a Python wrapper for MeCab, necessary to perform morphological analysis using the unidic-lite dictionary.","package":"mecab-python3","optional":false}],"imports":[{"note":"unidic-lite exposes the dictionary directory path via DICDIR. It does not provide classes for direct morphological analysis.","symbol":"DICDIR","correct":"import unidic_lite; print(unidic_lite.DICDIR)"}],"quickstart":{"code":"import unidic_lite\nfrom fugashi import Tagger\n\n# unidic-lite needs to be explicitly passed to the Tagger\ntagger = Tagger(f'-d \"{unidic_lite.DICDIR}\"')\n\ntext = \"すもももももももものうち\"\n\n# Analyze the text\nwords = []\nfor word in tagger(text):\n    words.append(f'{word.surface}\\t{word.feature.pos1}\\t{word.feature.lemma}')\n\nprint('\\n'.join(words))","lang":"python","description":"This quickstart demonstrates how to use `unidic-lite` with the `fugashi` library, a common MeCab wrapper. 
`unidic_lite.DICDIR` provides the path to the installed dictionary, which is passed to the `Tagger` constructor."},"warnings":[{"fix":"Install a MeCab wrapper (e.g., `pip install fugashi`) and pass `unidic_lite.DICDIR` to its Tagger constructor.","message":"unidic-lite is solely a dictionary resource. To perform Japanese morphological analysis, a separate MeCab wrapper library like `fugashi` or `mecab-python3` must be installed and used in conjunction with unidic-lite.","severity":"gotcha","affected_versions":"All"},{"fix":"Ensure sufficient disk space is available before installation. Consider using `unidic-lite-imitator` for significantly smaller footprint if suitable.","message":"Despite 'lite' in its name, unidic-lite requires approximately 250MB of disk space for the dictionary data after installation.","severity":"gotcha","affected_versions":"All"},{"fix":"For the most up-to-date dictionary, consider using the `unidic` package (requires a separate download step and more disk space) or explore alternatives.","message":"unidic-lite is based on UniDic 2.1.2 from 2013. This older version may lack vocabulary for modern terms and phrases compared to the full `unidic` package (which uses UniDic 3.1.0 and is much larger).","severity":"gotcha","affected_versions":"All"},{"fix":"Be aware of these differences; if strict adherence to official UniDic behavior is required, review the specific changes or use an unmodified UniDic distribution.","message":"The unidic-lite dictionary has minor modifications from the official UniDic release, including added entries for '令和', removal of single-character numeric and alphabetic words, and changes to `unk.def`. These might lead to slightly different tokenization results compared to an unmodified UniDic.","severity":"gotcha","affected_versions":"All"},{"fix":"Always explicitly pass `unidic_lite.DICDIR` to the MeCab Tagger constructor (as shown in the quickstart). 
For persistent issues, consult the MeCab wrapper's documentation, ensure MeCab is correctly installed on your system, and check system-specific dependencies.","message":"Users frequently encounter 'Failed initializing MeCab' errors when using MeCab wrappers with unidic-lite. This often stems from the wrapper failing to locate the dictionary or underlying MeCab installation issues (e.g., missing C++ redistributables on Windows).","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-15T00:00:00.000Z","next_check":"2026-07-14T00:00:00.000Z","problems":[]}