English Grapheme To Phoneme Conversion

2.1.0 · maintenance · verified Mon Apr 13

g2p-en is a Python module that converts English graphemes (spelling) to phonemes (pronunciation), a step required by tasks such as speech synthesis. It combines dictionary lookups, part-of-speech tagging for homograph disambiguation, and a neural network for out-of-vocabulary words; as of v2.0 the network runs inference with NumPy. The current version is 2.1.0, released in late 2019, and new releases appear to be infrequent.
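
The three-stage strategy described above (homograph disambiguation by part of speech, dictionary lookup, then a model for unknown words) can be sketched as a toy lookup chain. This is not g2p-en's code: every table, function name, and transcription below is a hypothetical stand-in, and the naive letter-to-sound fallback only marks the place where the library uses its neural network.

```python
# Toy illustration only: all names and tables here are hypothetical and
# are NOT part of g2p-en. Transcriptions are illustrative ARPAbet.

# Homographs: the pronunciation depends on a coarse part-of-speech tag.
HOMOGRAPHS = {
    "refuse": {
        "verb": ["R", "IH0", "F", "Y", "UW1", "Z"],
        "noun": ["R", "EH1", "F", "Y", "UW2", "Z"],
    },
}

# Ordinary dictionary entries: one pronunciation per word.
LEXICON = {
    "i": ["AY1"],
    "the": ["DH", "AH0"],
}

# Naive letter-to-sound table standing in for the neural model.
LETTER_SOUNDS = {
    "a": "AE1", "b": "B", "d": "D", "e": "EH1", "f": "F", "g": "G",
    "i": "IH1", "k": "K", "l": "L", "m": "M", "n": "N", "o": "AA1",
    "p": "P", "r": "R", "s": "S", "t": "T", "u": "AH1", "v": "V", "z": "Z",
}


def fallback(word):
    """Spell out a word letter by letter, skipping unmapped letters."""
    return [LETTER_SOUNDS[ch] for ch in word if ch in LETTER_SOUNDS]


def pronounce(word, pos):
    """Return phonemes for one word given a coarse part-of-speech tag."""
    word = word.lower()
    if word in HOMOGRAPHS:
        variants = HOMOGRAPHS[word]
        # Use the variant for this tag, or the first variant if the tag is unknown.
        return variants.get(pos, next(iter(variants.values())))
    if word in LEXICON:
        return LEXICON[word]
    return fallback(word)


print(pronounce("refuse", "verb"))
print(pronounce("refuse", "noun"))
print(pronounce("zab", "noun"))
```

Checking homographs before the general dictionary matters in this sketch because a plain dictionary hit would otherwise hide the part-of-speech distinction.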

Warnings

Install

pip install g2p_en

Imports

from g2p_en import G2p

Quickstart

Initializes the G2p converter and processes a list of English sentences, demonstrating handling of numbers, abbreviations, homographs, and out-of-vocabulary words.

from g2p_en import G2p

texts = [
    "I have $250 in my pocket.",  # number -> spelled out
    "popular pets, e.g. cats and dogs",  # "e.g." -> "for example"
    "I refuse to collect the refuse around here.",  # homograph
    "I'm an activationist.",  # newly coined word
]

# Create the converter once and reuse it for every sentence.
g2p = G2p()
for text in texts:
    out = g2p(text)  # list of phoneme strings for the sentence
    print(out)
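
The list printed by the quickstart is flat, so downstream code often needs to regroup it into words. The sketch below assumes the output shape shown in the g2p-en README, where a single-space string separates words; the helper `group_words` and the hand-written `sample` list are hypothetical and not part of the library.

```python
def group_words(phonemes):
    """Split a flat phoneme list into per-word lists at " " separators."""
    words, current = [], []
    for token in phonemes:
        if token == " ":
            # A space token closes the current word, if there is one.
            if current:
                words.append(current)
                current = []
        else:
            current.append(token)
    if current:
        words.append(current)
    return words


# Hand-written sample in the assumed output shape (not real library output).
sample = ["AY1", " ", "HH", "AE1", "V"]
print(group_words(sample))
```

With the real library, the same helper could be applied to the `out` value from the loop above, provided the separator assumption holds for the installed version.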
