Korean Grapheme-to-Phoneme (g2pkk)

0.1.2 · active · verified Fri Apr 17

g2pkk is a grapheme-to-phoneme (G2P) conversion module for Korean text that aims for cross-platform compatibility. The current version is 0.1.2; the project is active but still at an early stage of development, and releases are infrequent.

Common errors

Warnings

Install

Imports

Quickstart

This quickstart shows how to initialize the g2pkk converter and use it to turn Korean text into its pronunciation, written in Hangul as it is spoken rather than as it is spelled. It first makes sure the NLTK 'punkt' resource is available, which is a common prerequisite for NLP libraries.

import nltk
from g2pkk import G2p  # the package exports its converter class as G2p

# Important: download the NLTK 'punkt' resource if it is not already installed
try:
    nltk.data.find('tokenizers/punkt')
except LookupError:  # nltk.data.find raises LookupError for a missing resource
    print("Downloading NLTK 'punkt' resource...")
    nltk.download('punkt')

# Initialize the G2P converter
g2p = G2p()

# Convert Korean text to its phoneme representation
text = "안녕하세요 g2p 입니다. 반갑습니다. 123."
result = g2p(text)
print(f"Original: {text}")
print(f"Phonetic: {result}")

# Example with a single word (the result is Hangul, not a romanization)
pronunciation = g2p("한국어")
print(f"'한국어' phonetic: {pronunciation}")
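
To make concrete what "grapheme-to-phoneme" means for Korean, here is a toy sketch of a single pronunciation rule, liaison (연음): a final consonant moves into the next syllable when that syllable begins with the silent ㅇ. This is not g2pkk's implementation. The function names and the five-consonant table are hypothetical and chosen only for illustration, and the sketch ignores every other rule (palatalization, nasalization, tensification, and so on) that a real converter has to handle.

```python
# Toy illustration only: one Korean pronunciation rule (liaison),
# implemented with Unicode arithmetic. Not how g2pkk works internally.

HANGUL_BASE = 0xAC00   # first precomposed syllable, 가
HANGUL_LAST = 0xD7A3   # last precomposed syllable, 힣
SILENT_LEAD = 11       # index of the silent initial consonant ㅇ

# Final-consonant index -> initial-consonant index, for ㄱ ㄴ ㄹ ㅁ ㅂ only.
# (Assumption for this sketch: other finals are left untouched.)
TAIL_TO_LEAD = {1: 0, 4: 2, 8: 5, 16: 6, 17: 7}


def is_syllable(ch):
    """Return True if ch is a precomposed Hangul syllable."""
    return HANGUL_BASE <= ord(ch) <= HANGUL_LAST


def decompose(ch):
    """Split a syllable into (lead, vowel, tail) indices."""
    offset = ord(ch) - HANGUL_BASE
    return offset // 588, (offset % 588) // 28, offset % 28


def compose(lead, vowel, tail):
    """Build a syllable from (lead, vowel, tail) indices."""
    return chr(HANGUL_BASE + (lead * 21 + vowel) * 28 + tail)


def apply_liaison(text):
    """Move a final consonant onto a following syllable that starts with ㅇ."""
    chars = list(text)
    for i in range(len(chars) - 1):
        cur, nxt = chars[i], chars[i + 1]
        if not (is_syllable(cur) and is_syllable(nxt)):
            continue
        lead, vowel, tail = decompose(cur)
        next_lead, next_vowel, next_tail = decompose(nxt)
        if next_lead == SILENT_LEAD and tail in TAIL_TO_LEAD:
            chars[i] = compose(lead, vowel, 0)
            chars[i + 1] = compose(TAIL_TO_LEAD[tail], next_vowel, next_tail)
    return "".join(chars)


print(apply_liaison("한국어"))  # 한구거
```

The spelling 한국어 is pronounced 한구거 because the final ㄱ of 국 attaches to the following 어. A full converter applies many such rules together, which is why its output often differs from the written form.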
