g2p
Context-aware, rule-based grapheme-to-phoneme (G2P) mapping library that preserves indices. Version 2.3.1 (stable, as of 2025). Maintained by NRC-ILT. Monthly releases.
pip install g2p

Common errors
error ImportError: cannot import name 'make_g2p' from 'g2p'
cause Using an older version (1.x) where make_g2p didn't exist or was in a different location.
fix Upgrade to g2p>=2.0 and use `from g2p import make_g2p`.
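One way to guard against the old 1.x layout is to check the installed version before relying on the 2.x import path (a hedged sketch using the stdlib's importlib.metadata; the major-version check is illustrative):

```python
from importlib import metadata

# Check the installed g2p version before relying on the 2.x API.
try:
    g2p_version = metadata.version("g2p")
except metadata.PackageNotFoundError:
    g2p_version = None  # g2p is not installed in this environment

if g2p_version and int(g2p_version.split(".")[0]) >= 2:
    from g2p import make_g2p  # 2.x import location
```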
error ModuleNotFoundError: No module named 'networkx'
cause Before v2.1.1, g2p used networkx as a dependency. In v2.1.1+ it was replaced with a custom class, but with an old lock file or a source install, networkx may be missing.
fix Upgrade to g2p>=2.1.1, or install networkx manually: pip install networkx.
Warnings
breaking v2.0 changed the mapping configuration file format and the programmatic API. Old 1.x mappings are incompatible. See the migration guide: https://roedoejet.github.io/g2p/latest/migration-2/
fix Rewrite mappings for v2.x. Use make_g2p() instead of direct G2P class.
deprecated make_g2p(in_lang, out_lang) used to not tokenize by default; now it does, and the tok_lang argument is deprecated.
fix Use the boolean tokenize argument instead of tok_lang; pass tokenize=False if you need to disable tokenization.
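In new code, pass the boolean flag directly (a sketch guarded so it degrades gracefully if g2p is absent; the eng → eng-ipa pair is illustrative):

```python
try:
    from g2p import make_g2p

    # tokenize is a boolean; the old tok_lang argument is deprecated.
    transducer = make_g2p("eng", "eng-ipa", tokenize=False)
except ImportError:
    transducer = None  # g2p not installed in this environment
```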
gotcha The first import of g2p loads a 45MB lexicon (for English), causing a ~2s import delay. This is expected.
fix Consider lazy-loading or using a non-English mapping to avoid the delay.
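One lazy-loading pattern is to defer the import until first attribute access (a stdlib-only sketch; `json` stands in for `g2p` so the snippet runs anywhere, but the same proxy would defer g2p's lexicon load to first use):

```python
import importlib

class LazyModule:
    """Proxy that imports the real module on first attribute access."""
    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only called when normal lookup fails, i.e. for real module attributes.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)

# 'json' stands in for 'g2p': the import cost is paid on first use, not at startup.
lazy = LazyModule("json")
print(lazy.dumps({"ok": True}))  # triggers the real import here
```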
Imports
- make_g2p
  wrong:   from g2p.mappings import make_g2p
  correct: from g2p import make_g2p
- Token
  wrong:   from g2p.tokenizer import Token
  correct: from g2p import Token
Quickstart
from g2p import make_g2p
# Create a transducer from English orthography to IPA
transducer = make_g2p('eng', 'eng-ipa')
# Convert text
result = transducer('Hello world')
print(result.output_string)
# 'hɛˈloʊ wɜrld' (approximate)
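The "preserves indices" behaviour can be illustrated with a toy rule engine (not g2p's implementation, just the core idea: each output segment remembers the input span it came from):

```python
# Toy longest-match rules; real g2p mappings are configured per language.
RULES = {"ph": "f", "th": "θ", "ng": "ŋ"}

def toy_g2p(text):
    """Apply rules left to right, keeping (in_start, in_end, output) spans."""
    spans, i = [], 0
    while i < len(text):
        if text[i:i+2] in RULES:           # prefer the 2-character rule
            spans.append((i, i + 2, RULES[text[i:i+2]]))
            i += 2
        else:                              # pass single characters through
            spans.append((i, i + 1, text[i]))
            i += 1
    output = "".join(s[2] for s in spans)
    return output, spans

out, spans = toy_g2p("phone")
print(out)       # fone
print(spans[0])  # (0, 2, 'f') — 'f' came from input characters 0..2
```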