Double Metaphone for Python
The `doublemetaphone` library is a Python wrapper around the C++ implementation of the Double Metaphone phonetic algorithm. The algorithm produces approximate phonetic codes for words, which makes it useful for matching names or words that sound alike but are spelled differently. The current version is 1.2. Releases appear to be infrequent, with long gaps between updates.
Warnings
- gotcha The `doublemetaphone` function returns a tuple of two strings (primary and secondary codes), not a single string. Ensure your code handles both return values, even if one is an empty string.
- breaking Older versions (primarily Python 2) might have raised `TypeError` if non-unicode strings were passed. Python 3 strings are Unicode by default, but be mindful of encoding if dealing with byte strings or legacy data.
- gotcha There are multiple Python packages implementing or wrapping Metaphone/Double Metaphone algorithms (e.g., `metaphone`, `phonetics`, `doublemetaphone`). Ensure you install and import the correct package for the C++ wrapper implementation (`doublemetaphone`).
- gotcha Installation might require a C++ compiler if a pre-built wheel is not available for your specific Python version and operating system. This is common for Python wrappers around C/C++ libraries.
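The first two warnings can be handled with a pair of small helpers. This is a minimal sketch and not part of the library: `to_text` and `usable_codes` are hypothetical names, and the UTF-8 default is an assumption about your data.

```python
def to_text(value, encoding="utf-8"):
    """Return `value` as str, decoding bytes with the assumed encoding."""
    if isinstance(value, bytes):
        # Assumption: legacy byte strings are UTF-8; pass another encoding if not.
        return value.decode(encoding, errors="replace")
    return value


def usable_codes(codes):
    """Drop empty codes from a (primary, secondary) tuple.

    An empty secondary code means there is no alternative encoding. Keeping it
    would make two unrelated words look alike when both secondaries are empty.
    """
    primary, secondary = codes
    return {code for code in (primary, secondary) if code}
```

With the package installed, `usable_codes(doublemetaphone(to_text(raw_name)))` would give the set of codes to compare for one input.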
Install
pip install doublemetaphone
Imports
- doublemetaphone
Avoid: import doublemetaphone; doublemetaphone.doublemetaphone.doublemetaphone('word')
Prefer: from doublemetaphone import doublemetaphone
Quickstart
from doublemetaphone import doublemetaphone
word1 = "Smith"
word2 = "Schmidt"
primary1, secondary1 = doublemetaphone(word1)
primary2, secondary2 = doublemetaphone(word2)
print(f"'{word1}': Primary='{primary1}', Secondary='{secondary1}'")
print(f"'{word2}': Primary='{primary2}', Secondary='{secondary2}'")
# Double Metaphone returns two codes to account for phonetic ambiguity.
# For example, the secondary code of 'Smith' matches the primary code of 'Schmidt'.
# Ignore empty codes so that two empty secondaries do not count as a match.
codes1 = {code for code in (primary1, secondary1) if code}
codes2 = {code for code in (primary2, secondary2) if code}
if codes1 & codes2:
    print(f"'{word1}' and '{word2}' are phonetically similar.")
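The pairwise comparison above extends naturally to a lookup over many names. The following is a sketch under stated assumptions: `build_phonetic_index` and `find_similar` are hypothetical helpers, not library functions, and `encode` stands for any callable that returns a `(primary, secondary)` tuple, such as `doublemetaphone`.

```python
from collections import defaultdict


def build_phonetic_index(names, encode):
    """Map each non-empty phonetic code to the set of names that produce it.

    `encode` is any callable returning a (primary, secondary) tuple,
    for example `doublemetaphone` from this package.
    """
    index = defaultdict(set)
    for name in names:
        for code in encode(name):
            if code:  # skip empty secondary codes
                index[code].add(name)
    return index


def find_similar(name, index, encode):
    """Return indexed names that share at least one code with `name`."""
    matches = set()
    for code in encode(name):
        if code:
            matches |= index.get(code, set())
    matches.discard(name)  # a name is trivially similar to itself
    return matches
```

Passing the encoder in as a parameter keeps the helpers independent of which Metaphone package is installed, so the same index code can be reused or tested with a stub encoder.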