g2p
Context-aware, rule-based grapheme-to-phoneme (G2P) mapping library that preserves indices. Version 2.3.1 (stable, as of 2025). Maintained by NRC-ILT. Monthly releases.
pip install g2p

Common errors
error ImportError: cannot import name 'make_g2p' from 'g2p'
cause Using an older version (1.x) where make_g2p didn't exist or was in a different location.
fix Upgrade to g2p>=2.0 and use `from g2p import make_g2p`.
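One way to guard against the old 1.x layout is to check the installed version before relying on the 2.x import path (a hedged sketch using the stdlib's importlib.metadata; the major-version check is illustrative):

```python
from importlib import metadata

# Check the installed g2p version before relying on the 2.x API.
try:
    g2p_version = metadata.version("g2p")
except metadata.PackageNotFoundError:
    g2p_version = None  # g2p is not installed in this environment

if g2p_version and int(g2p_version.split(".")[0]) >= 2:
    from g2p import make_g2p  # 2.x import location
```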
error ModuleNotFoundError: No module named 'networkx'
cause Before v2.1.1, g2p used networkx as a dependency. In v2.1.1+ it was replaced with a custom class, but with an old lock file or a source install, networkx may be missing.
fix Upgrade to g2p>=2.1.1, or install networkx manually: pip install networkx.
Warnings
breaking v2.0 changed the mapping configuration file format and the programmatic API. Old 1.x mappings are incompatible. See the migration guide: https://roedoejet.github.io/g2p/latest/migration-2/
fix Rewrite mappings for v2.x. Use make_g2p() instead of direct G2P class.
deprecated make_g2p(in_lang, out_lang) used to not tokenize by default; now it does, and the tok_lang argument is deprecated.
fix Use the boolean tokenize argument instead of tok_lang; pass tokenize=False if you need to disable tokenization.
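In new code, pass the boolean flag directly (a sketch guarded so it degrades gracefully if g2p is absent; the eng → eng-ipa pair is illustrative):

```python
try:
    from g2p import make_g2p

    # tokenize is a boolean; the old tok_lang argument is deprecated.
    transducer = make_g2p("eng", "eng-ipa", tokenize=False)
except ImportError:
    transducer = None  # g2p not installed in this environment
```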
gotcha The first import of g2p loads a 45MB lexicon (for English), causing a ~2s import delay. This is expected.
fix Consider lazy-loading or using a non-English mapping to avoid the delay.
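One lazy-loading pattern is to defer the import until first attribute access (a stdlib-only sketch; `json` stands in for `g2p` so the snippet runs anywhere, but the same proxy would defer g2p's lexicon load to first use):

```python
import importlib

class LazyModule:
    """Proxy that imports the real module on first attribute access."""
    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only called when normal lookup fails, i.e. for real module attributes.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)

# 'json' stands in for 'g2p': the import cost is paid on first use, not at startup.
lazy = LazyModule("json")
print(lazy.dumps({"ok": True}))  # triggers the real import here
```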
Imports
- make_g2p
  wrong:   from g2p.mappings import make_g2p
  correct: from g2p import make_g2p
- Token
  wrong:   from g2p.tokenizer import Token
  correct: from g2p import Token
Quickstart
from g2p import make_g2p
# Create a transducer from English orthography to IPA
transducer = make_g2p('eng', 'eng-ipa')
# Convert text
result = transducer('Hello world')
print(result.output_string)
# 'hɛˈloʊ wɜrld' (approximate)
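The "preserves indices" behaviour can be illustrated with a toy rule engine (not g2p's implementation, just the core idea: each output segment remembers the input span it came from):

```python
# Toy longest-match rules; real g2p mappings are configured per language.
RULES = {"ph": "f", "th": "θ", "ng": "ŋ"}

def toy_g2p(text):
    """Apply rules left to right, keeping (in_start, in_end, output) spans."""
    spans, i = [], 0
    while i < len(text):
        if text[i:i+2] in RULES:           # prefer the 2-character rule
            spans.append((i, i + 2, RULES[text[i:i+2]]))
            i += 2
        else:                              # pass single characters through
            spans.append((i, i + 1, text[i]))
            i += 1
    output = "".join(s[2] for s in spans)
    return output, spans

out, spans = toy_g2p("phone")
print(out)       # fone
print(spans[0])  # (0, 2, 'f') — 'f' came from input characters 0..2
```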