pykakasi

2.3.0 · active · verified Sat Apr 11

pykakasi is a Python natural language processing (NLP) library that transliterates Japanese text (hiragana, katakana, and kanji) into rōmaji (Latin alphabet). It is based on the C-language kakasi library and supports NFC-normalized characters. The current version is 2.3.0; releases follow no fixed cadence and are published as needed.
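Because the library is described as supporting NFC-normalized characters, input that may arrive in decomposed form (for example, a kana followed by a separate combining voiced-sound mark) can be composed first. The following is a minimal sketch using only the standard library; `to_nfc` is a hypothetical helper name, and whether a given input needs this step is an assumption to check against your own data.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C (composed form)."""
    return unicodedata.normalize("NFC", text)


# "か" (U+304B) followed by the combining dakuten (U+3099) is the
# decomposed spelling of "が" (U+304C).
decomposed = "\u304b\u3099"
composed = to_nfc(decomposed)

print(len(decomposed), len(composed))
print(composed == "\u304c")
```

The normalized string can then be passed to the converter in place of the raw input.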

Warnings

Install

Imports

Quickstart

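Initializes the kakasi converter and demonstrates both the deprecated v1.x API (`setMode()`, `getConverter()`, `do()`) and the recommended v2.x API for converting Japanese text. The v2.x `convert()` method needs no mode setup and returns a list of dictionaries, one per token, each holding the original text together with its hiragana, katakana, and romaji forms.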

import pykakasi

kks = pykakasi.kakasi()
text = "かな漢字交じり文"

# Deprecated v1.x API: configure conversion modes before building a converter
kks.setMode('H', 'a') # Hiragana to romaji
kks.setMode('K', 'a') # Katakana to romaji
kks.setMode('J', 'a') # Kanji to romaji
kks.setMode('r', 'Hepburn') # Use Hepburn Romanization
kks.setMode('s', True) # Add spaces
kks.setMode('C', True) # Capitalize

converter = kks.getConverter()
result_old_api = converter.do(text)
print(f"Old API result: {result_old_api}")

# Recommended new API (v2.0.0+)
result_new_api = pykakasi.kakasi().convert(text)
print("\nNew API result (default modes):")
for item in result_new_api:
    print(f"Original: {item['orig']}, Katakana: {item['kana']}, Hiragana: {item['hira']}, Romaji: {item['hepburn']}")
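Since `convert()` yields one dictionary per token, a common follow-up is to join a single field back into one string. The sketch below is illustrative rather than part of pykakasi: `join_tokens` and `romanize` are hypothetical helper names, and the sample token list is hand-written to mirror the dictionary shape shown above.

```python
from typing import Dict, List


def join_tokens(tokens: List[Dict[str, str]], key: str = "hepburn", sep: str = " ") -> str:
    """Join one field of each token dict into a single string, skipping empty values."""
    return sep.join(tok[key] for tok in tokens if tok.get(key))


def romanize(text: str) -> str:
    """Convert Japanese text to space-separated Hepburn romaji via pykakasi."""
    import pykakasi  # imported lazily so join_tokens works without the package

    return join_tokens(pykakasi.kakasi().convert(text))


# Hand-written sample in the same shape as convert() results (illustrative data).
sample = [
    {"orig": "かな", "hepburn": "kana"},
    {"orig": "漢字", "hepburn": "kanji"},
]
print(join_tokens(sample))
print(join_tokens(sample, key="orig", sep=""))
```

Calling `romanize(text)` applies the same join to live output; passing a different `key` to `join_tokens` selects another field of the result dictionaries.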
