{"id":6719,"library":"misaki","title":"Misaki G2P Engine","description":"Misaki is a Grapheme-to-Phoneme (G2P) engine for Text-to-Speech (TTS) applications, converting written text into phonemes. It primarily supports English with dictionary-based lookups and offers configurable fallbacks, including rule-based systems like `espeak-ng` and optional neural network models. Designed to be lightweight and efficient, Misaki is often integrated into larger TTS systems like Kokoro. The current version is 0.9.4, and the project is actively maintained on GitHub.","status":"active","version":"0.9.4","language":"en","source_language":"en","source_url":"https://github.com/hexgrad/misaki","tags":["g2p","tts","phonemizer","nlp","speech-synthesis","language-processing"],"install":[{"cmd":"pip install \"misaki[en]\"","lang":"bash","label":"Install for English language support"},{"cmd":"sudo apt-get install espeak-ng","lang":"bash","label":"Install espeak-ng (Debian/Ubuntu for fallback)"}],"dependencies":[{"reason":"Optional but highly recommended external system dependency for robust out-of-dictionary (OOD) word fallback. Installation varies by OS.","package":"espeak-ng","optional":true},{"reason":"Required by the `misaki[en]` extra for advanced phonemization capabilities.","package":"phonemizer-fork","optional":false},{"reason":"Pulled in when `misaki` is used with transformer-based POS tagging (e.g., `trf=True`) or installed through downstream packages such as KittenTTS; this can introduce large `torch`/CUDA installations.","package":"spacy-curated-transformers","optional":true}],"imports":[{"note":"Language-specific G2P engines are imported from submodules like `misaki.en`.","symbol":"G2P","correct":"from misaki import en\ng2p_engine = en.G2P(...)"}],"quickstart":{"code":"from misaki import en\n\n# Initialize G2P for American English, no transformer, no external fallback\ng2p = en.G2P(trf=False, british=False, fallback=None)\n\ntext = \"Misaki is a G2P engine designed for Text-to-Speech models.\"\nphonemes, tokens = g2p(text)\n\nprint(f\"Text: {text}\")\nprint(f\"Phonemes: {phonemes}\")\n\n# Example with espeak-ng fallback (requires espeak-ng installed on system)\n# from misaki import espeak\n# fallback_espeak = espeak.EspeakFallback(british=False)\n# g2p_with_fallback = en.G2P(trf=False, british=False, fallback=fallback_espeak)\n# text_ood = \"Now outofdictionary words are handled by espeak.\"\n# phonemes_ood, _ = g2p_with_fallback(text_ood)\n# print(f\"Text (OOD): {text_ood}\")\n# print(f\"Phonemes (OOD): {phonemes_ood}\")","lang":"python","description":"Initializes the Misaki G2P engine for English and processes a sample text to obtain its phonemic representation. It demonstrates basic usage without optional transformer models or external fallbacks, with commented-out code showing how to integrate `espeak-ng` for out-of-dictionary word handling."},"warnings":[{"fix":"Use a Python 3.10 or 3.11 environment. Check the project's GitHub issues for updated compatibility information before attempting Python 3.13+.","message":"Misaki versions 0.9.4 and later do not currently support Python 3.13, and installation issues have been reported with other Python versions. Python 3.10 and 3.11 are typically the most compatible.","severity":"breaking","affected_versions":">=0.9.4"},{"fix":"Ensure `espeak-ng` is installed at the system level for your specific OS if you intend to use it as a fallback. Refer to the `espeak-ng` documentation for installation instructions.","message":"The `espeak-ng` library, commonly used as a fallback for out-of-dictionary words in Misaki, is an external system dependency. It must be installed separately from the Python package, and its installation method varies by operating system (e.g., `apt-get install espeak-ng` on Debian/Ubuntu). If it is not installed, words not found in Misaki's internal dictionaries are spelled out letter-by-letter or marked as unknown.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Consult the `EN_PHONES.md` file in the Misaki GitHub repository to understand the specific phoneme set and its mappings if precise linguistic accuracy is critical. Be aware that the phonemes are optimized for machine learning models.","message":"The phoneme set used by Misaki for English is specifically designed for optimal performance in neural networks and may not strictly adhere to traditional linguistic IPA representations. The author notes that some symbols might be 'butchered or reappropriated'. This can lead to unexpected phoneme mappings for linguists or users expecting strict IPA compliance.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Check what your `pip install` commands pull in. If you do not need transformer-based POS tagging, install `misaki` without the extras that trigger these large dependencies, or prune the environment after installation if no other part of your project needs the extra packages.","message":"Enabling transformer-based POS tagging (`trf=True`) or installing `misaki` as a dependency of other libraries like KittenTTS can pull in heavy dependencies such as `torch` and NVIDIA CUDA packages, potentially adding several gigabytes to the installation size, even if no GPU is used or Misaki itself runs with `trf=False`.","severity":"gotcha","affected_versions":"All versions with `spacy-curated-transformers` dependency"},{"fix":"Misaki may produce incorrect phonemes for homographs whose pronunciation depends on meaning rather than part of speech. Where high accuracy is required, pre-process such text or specify pronunciations manually using the Markdown-like syntax `[word](/phonemes/)`.","message":"Misaki's current implementation may have limitations in non-POS-based homograph disambiguation (e.g., distinguishing 'graph axes' from 'throwing axes'). While it handles some POS-based disambiguation, more complex contextual disambiguation remains a 'TODO'.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-15T00:00:00.000Z","next_check":"2026-07-14T00:00:00.000Z","problems":[]}