CMU Pronouncing Dictionary Python Wrapper

1.1.3 · active · verified Sun Apr 12

CMUdict (`cmudict`) is a Python wrapper package for the CMU Pronouncing Dictionary data files. It provides access to over 134,000 English words and their ARPAbet pronunciations, and exposes the data with minimal assumptions about how it will be used. The library is actively maintained: patch releases are frequent and mostly cover dependency updates or minor fixes, while occasional minor version bumps add features such as type hints.
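
To make the data shape concrete, here is a minimal sketch of working with a single ARPAbet pronunciation. It assumes a pronunciation is a list of phoneme strings in which vowel phonemes end in a stress digit (0, 1, or 2), as in the CMU dictionary files. The sample list and the helper names `strip_stress` and `count_syllables` are illustrative and are not part of the cmudict package.

```python
# Illustrative sample: one pronunciation as a list of ARPAbet phonemes.
# Vowel phonemes end in a stress digit: 0 = none, 1 = primary, 2 = secondary.
sample_pronunciation = ["HH", "AH0", "L", "OW1"]


def strip_stress(phonemes):
    """Return the phonemes with any trailing stress digit removed."""
    return [p.rstrip("012") for p in phonemes]


def count_syllables(phonemes):
    """Count syllables as the number of stress-marked (vowel) phonemes."""
    return sum(1 for p in phonemes if p[-1:] in ("0", "1", "2"))


print(strip_stress(sample_pronunciation))
print(count_syllables(sample_pronunciation))
```

Counting stress-marked phonemes is a common heuristic for syllable counts; it relies only on the stress-digit convention and not on any particular library call.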

Warnings

Install

Imports

Quickstart

This quickstart loads the dictionary with `cmudict.dict()` and looks up the pronunciations of a single word. It also shows `cmudict.entries()`, which returns every (word, pronunciation) pair and is better suited to iterating over the whole dataset.

import cmudict

# Get the full dictionary as a mapping from word to a list of pronunciations
pron_dict = cmudict.dict()

word = "hello"
pronunciations = pron_dict.get(word)

if pronunciations:
    print(f"Pronunciations for '{word}': {pronunciations}")
    # Example: Accessing the first pronunciation and its phonemes
    first_pronunciation_phonemes = pronunciations[0]
    print(f"First pronunciation phonemes: {first_pronunciation_phonemes}")
else:
    print(f"'{word}' not found in CMUdict.")

# entries() returns a list of (word, pronunciation) tuples; a word with
# several pronunciations appears once per variant.
all_entries = cmudict.entries()
print(f"Total entries (including variants): {len(all_entries)}")

# Looking up one word via entries() scans the whole list, so prefer dict()
# for single-word lookups. Shown here only for comparison.
target_word = "example"
example_pronunciations = [p for w, p in all_entries if w == target_word]
print(f"Pronunciations for '{target_word}' (from entries): {example_pronunciations}")
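
The relationship between the two calls can be sketched without loading the dictionary. The example below groups (word, pronunciation) tuples, the shape `cmudict.entries()` yields in the quickstart, into a word-to-pronunciations mapping like the one `cmudict.dict()` returns. The sample tuples and the helper name `group_entries` are illustrative assumptions, not part of the package.

```python
from collections import defaultdict

# Illustrative sample in the (word, pronunciation) shape; not real library output.
sample_entries = [
    ("hello", ["HH", "AH0", "L", "OW1"]),
    ("hello", ["HH", "EH0", "L", "OW1"]),
    ("world", ["W", "ER1", "L", "D"]),
]


def group_entries(entries):
    """Group (word, pronunciation) tuples into {word: [pronunciation, ...]}."""
    grouped = defaultdict(list)
    for word, pronunciation in entries:
        grouped[word].append(pronunciation)
    # Return a plain dict so missing words raise KeyError instead of being created.
    return dict(grouped)


lookup = group_entries(sample_entries)
print(lookup["hello"])        # both variants, in input order
print(lookup.get("missing"))  # None, because the word is not in the sample
```

With the package installed, passing `cmudict.entries()` to a helper like this should produce an equivalent mapping. Building the mapping once is generally cheaper than filtering the full entry list for each lookup.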
