Misaki G2P Engine

0.9.4 · active · verified Wed Apr 15

Misaki is a grapheme-to-phoneme (G2P) engine for text-to-speech (TTS) applications: it converts written text into phonemes. It primarily supports English through dictionary-based lookups and offers configurable fallbacks for out-of-dictionary words, including rule-based systems like `espeak-ng` and optional neural network models. Designed to be lightweight and efficient, Misaki is often integrated into larger TTS systems such as Kokoro. The current version is 0.9.4, and the project is actively maintained on GitHub.

Warnings

Install

Imports

Quickstart

This example initializes the Misaki G2P engine for American English and converts a sample sentence to its phonemic representation. It shows basic usage without the optional transformer model or an external fallback; the commented-out code shows how to plug in `espeak-ng` to handle out-of-dictionary words.

from misaki import en

# Initialize G2P for American English, no transformer, no external fallback
g2p = en.G2P(trf=False, british=False, fallback=None)

text = "Misaki is a G2P engine designed for Text-to-Speech models."
phonemes, tokens = g2p(text)

print(f"Text: {text}")
print(f"Phonemes: {phonemes}")

# Example with espeak-ng fallback (requires espeak-ng installed on the system)
# from misaki import espeak
# fallback_espeak = espeak.EspeakFallback(british=False)
# g2p_with_fallback = en.G2P(trf=False, british=False, fallback=fallback_espeak)
# text_ood = "Now outofdictionary words are handled by espeak."
# phonemes_ood, _ = g2p_with_fallback(text_ood)
# print(f"Text (OOD): {text_ood}")
# print(f"Phonemes (OOD): {phonemes_ood}")
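
The dictionary-first, fallback-second behavior described above can be pictured with a small toy sketch. This is not Misaki's implementation: `TOY_LEXICON`, `toy_g2p`, `spell_out`, and the `"?"` unknown marker are hypothetical names invented for illustration, and the phoneme strings are made up. The sketch only shows the general pattern of looking a word up first and calling a pluggable fallback when the lookup misses.

```python
from typing import Callable, Dict, Optional

# Hypothetical toy lexicon; a real G2P dictionary is far larger.
# The phoneme strings here are illustrative only.
TOY_LEXICON: Dict[str, str] = {
    "hello": "həˈloʊ",
    "world": "wɝld",
}

# Hypothetical marker emitted when a word is unknown and no fallback is set.
UNKNOWN = "?"


def toy_g2p(
    text: str,
    lexicon: Dict[str, str],
    fallback: Optional[Callable[[str], str]] = None,
) -> str:
    """Convert text to phonemes: dictionary lookup first, fallback second."""
    phonemes = []
    for raw in text.lower().split():
        word = raw.strip(".,!?")  # drop simple trailing/leading punctuation
        if not word:
            continue
        if word in lexicon:
            phonemes.append(lexicon[word])   # in-dictionary word
        elif fallback is not None:
            phonemes.append(fallback(word))  # out-of-dictionary, fallback set
        else:
            phonemes.append(UNKNOWN)         # out-of-dictionary, no fallback
    return " ".join(phonemes)


def spell_out(word: str) -> str:
    """Stand-in fallback that spells the word letter by letter."""
    return "-".join(word)


print(toy_g2p("Hello world.", TOY_LEXICON))
print(toy_g2p("hello kokoro", TOY_LEXICON))
print(toy_g2p("hello kokoro", TOY_LEXICON, fallback=spell_out))
```

Passing the fallback as a plain callable keeps the lookup loop unaware of how out-of-dictionary words are resolved, which is the same separation the `fallback=` argument suggests in the quickstart above.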
