{"id":9877,"library":"konlpy","title":"KoNLPy","description":"KoNLPy (Korean Natural Language Processing in Python) is a Python package designed for Korean text analysis. It provides a consistent API for various popular Korean NLP tools written primarily in Java, including Hannanum, Kkma, Komoran, Mecab (unsupported on Windows), and Okt (Open Korean Text). The current version is 0.6.0, and while its release cadence is irregular, the library is actively maintained to integrate new upstream NLP tools.","status":"active","version":"0.6.0","language":"en","source_language":"en","source_url":"https://github.com/konlpy/konlpy","tags":["NLP","Korean","text processing","tokenization","morphological analysis","POS tagging"],"install":[{"cmd":"pip install konlpy","lang":"bash","label":"Install KoNLPy"}],"dependencies":[],"imports":[{"symbol":"Okt","correct":"from konlpy.tag import Okt"},{"symbol":"Kkma","correct":"from konlpy.tag import Kkma"},{"symbol":"Komoran","correct":"from konlpy.tag import Komoran"},{"note":"Mecab is not officially supported on Windows and requires additional system-level installation (e.g., `python-mecab-ko` and `mecab-ko-dic`) on other OSes.","symbol":"Mecab","correct":"from konlpy.tag import Mecab"},{"symbol":"Hannanum","correct":"from konlpy.tag import Hannanum"}],"quickstart":{"code":"from konlpy.tag import Okt\n\nokt = Okt()\ntext = \"아버지가 방에 들어가신다.\"\n\nprint(f\"Original text: {text}\")\nprint(f\"Tokenization: {okt.morphs(text)}\")\nprint(f\"Part-of-speech tagging: {okt.pos(text)}\")\nprint(f\"Nouns: {okt.nouns(text)}\")","lang":"python","description":"This quickstart demonstrates basic Korean text processing using the `Okt` (Open Korean Text) tagger, including morphological analysis, part-of-speech tagging, and noun extraction. Ensure a JDK is installed and configured for this to run."},"warnings":[{"fix":"Install a JDK 8+ (e.g., OpenJDK). 
Set the `JAVA_HOME` environment variable to your JDK installation directory and add `%JAVA_HOME%\\bin` (Windows) or `$JAVA_HOME/bin` (Linux/macOS) to your system's PATH. Restart your terminal/IDE after configuration.","message":"KoNLPy relies on Java-based NLP tools and therefore requires a Java Development Kit (JDK) 8 or higher to be installed and properly configured in your system's PATH. Without a correctly configured JVM, most taggers will fail to initialize or run, often with `OSError: 'JVM' is not running.`","severity":"breaking","affected_versions":"All versions"},{"fix":"On Windows, avoid using `Mecab` and opt for alternative taggers like `Okt`, `Komoran`, `Kkma`, or `Hannanum`. On Linux/macOS, follow the specific system-level installation instructions (e.g., `pip install python-mecab-ko` and `mecab-ko-dic`) provided on the KoNLPy GitHub repository.","message":"The Mecab tagger (`konlpy.tag.Mecab`) is not officially supported on Windows due to its reliance on a C++ library and specific dictionary setup. Installation on non-Linux/macOS systems can be very challenging and prone to errors.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Increase the JVM's maximum heap space by setting the `_JAVA_OPTIONS` environment variable before running your Python script. For example, `export _JAVA_OPTIONS=\"-Xmx4g\"` (Linux/macOS) or `set _JAVA_OPTIONS=\"-Xmx4g\"` (Windows) will allocate 4GB. Adjust the `Xmx` value as needed.","message":"Processing very large texts or numerous documents concurrently can lead to `java.lang.OutOfMemoryError: Java heap space` errors. This is due to the underlying JVM's default memory limits.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-17T00:00:00.000Z","next_check":"2026-07-16T00:00:00.000Z","problems":[{"fix":"Install JDK 8 or higher (e.g., OpenJDK). 
Set the `JAVA_HOME` environment variable to the root directory of your JDK installation (e.g., `C:\\Program Files\\Java\\jdk-11` or `/usr/lib/jvm/java-11-openjdk`). Add the `bin` subdirectory of your JDK (`%JAVA_HOME%\\bin` or `$JAVA_HOME/bin`) to your system's PATH environment variable. Restart your terminal/IDE.","cause":"The Java Development Kit (JDK) is either not installed, not found in the system's PATH, or the `JAVA_HOME` environment variable is not correctly set. The underlying JPype library cannot locate an available Java Virtual Machine.","error":"OSError: 'JVM' is not running."},{"fix":"If on Windows, consider using `Okt`, `Komoran`, `Kkma`, or `Hannanum` instead of `Mecab`. On Linux/macOS, ensure Mecab and its dictionary (`mecab-ko-dic`) are correctly installed by following specific instructions for your OS, which typically involves installing `python-mecab-ko`.","cause":"The Mecab dictionary required by the `Mecab` tagger is missing or incorrectly configured. This often happens on systems where Mecab is not fully supported (like Windows) or if its complex dependencies weren't met during installation.","error":"FileNotFoundError: 'mecab-ko-dic' not found."},{"fix":"Increase the JVM's maximum heap size. You can do this by setting the `_JAVA_OPTIONS` environment variable before running your Python script. For example, to allocate 4GB of memory, use `export _JAVA_OPTIONS=\"-Xmx4g\"` on Linux/macOS or `set _JAVA_OPTIONS=\"-Xmx4g\"` on Windows.","cause":"The Java Virtual Machine (JVM) that KoNLPy uses has exhausted its allocated memory. This usually occurs when processing extremely long texts or a large volume of data.","error":"java.lang.OutOfMemoryError: Java heap space"},{"fix":"Ensure you import the tagger class correctly. The standard way is `from konlpy.tag import Okt` and then `okt = Okt()`. 
Alternatively, import the submodule with `from konlpy import tag` and then call `tagger = tag.Okt()`.","cause":"This error occurs when code accesses a tagger through the top-level `konlpy` module (e.g., `konlpy.tag.Okt()`) without first importing the `tag` submodule or the specific tagger class.","error":"AttributeError: module 'konlpy' has no attribute 'tag'"}]}