mojimoji
mojimoji is a Cython-based Python library designed for fast conversion between Japanese half-width (hankaku) and full-width (zenkaku) characters. The current version is 0.0.13, and it receives updates for bug fixes and Python version compatibility, though major feature releases are infrequent.
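To make the half-width/full-width distinction concrete, here is a minimal pure-Python sketch of the ASCII portion of the mapping only. It does not use mojimoji and is not a description of how the library is implemented; the helper name `ascii_zen_to_han_sketch` is hypothetical. It relies on the Unicode layout in which the full-width forms U+FF01 to U+FF5E sit exactly 0xFEE0 above ASCII U+0021 to U+007E, and the ideographic space U+3000 corresponds to the ASCII space.

```python
# Illustrative only: a hypothetical helper, not part of mojimoji.
FULLWIDTH_OFFSET = 0xFEE0  # distance between U+FF01..U+FF5E and U+0021..U+007E


def ascii_zen_to_han_sketch(text: str) -> str:
    """Map full-width ASCII-range characters to their half-width forms."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0xFF01 <= code <= 0xFF5E:
            # Full-width letter, digit, or punctuation: shift down into ASCII.
            out.append(chr(code - FULLWIDTH_OFFSET))
        elif code == 0x3000:
            # Ideographic (full-width) space becomes an ASCII space.
            out.append(" ")
        else:
            # Everything else (kanji, kana, plain ASCII) is left untouched.
            out.append(ch)
    return "".join(out)


print(ascii_zen_to_han_sketch("ａｂｃ０１２"))
```

Kana cannot be handled by a fixed offset like this, because half-width katakana occupy a separate block and voiced sound marks are separate characters, which is one reason to reach for a dedicated library.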
Common errors
- ModuleNotFoundError: No module named 'mojimoji'
cause The `mojimoji` Python package is not installed in the current environment, or the Python interpreter cannot find it.
fix Install the library using pip: `pip install mojimoji`.
- Characters (e.g., English letters, numbers) are not converting or are converting unexpectedly alongside Japanese characters.
cause By default, `mojimoji` converts all supported character categories (kana, digits, ASCII). Users often expect selective conversion without explicitly specifying parameters.
fix Pass `kana=False`, `digit=False`, or `ascii=False` to `zen_to_han()` or `han_to_zen()` to prevent conversion of specific character types. For example, `mojimoji.zen_to_han('ＡＢＣ１２３', ascii=False)` keeps 'ＡＢＣ' full-width while converting '１２３' to half-width '123'.
- Error during installation: `error: command 'gcc' failed with exit status 1` or similar compilation errors on specific platforms.
cause As a Cython-based library, `mojimoji` requires a C/C++ compiler toolchain (like `gcc` on Linux/macOS or MSVC on Windows) during installation if no pre-compiled wheel is available for your Python version and architecture.
fix Ensure you have the necessary build tools installed. For Windows, install 'Build Tools for Visual Studio'. For Linux, install `build-essential` (e.g., `sudo apt-get install build-essential`). For macOS, install Xcode Command Line Tools (`xcode-select --install`).
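For the `ModuleNotFoundError` case above, it can help to check which interpreter is actually running and whether it can locate the package before reinstalling. The sketch below uses only the standard library (`importlib.util.find_spec` and `sys.executable`); the helper name `is_importable` is hypothetical.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate the module."""
    return importlib.util.find_spec(module_name) is not None


print(f"Interpreter: {sys.executable}")
if is_importable("mojimoji"):
    print("mojimoji is visible to this interpreter")
else:
    # Installing through the same interpreter avoids pip/python mismatches.
    print(f"Not found; try: {sys.executable} -m pip install mojimoji")
```

Running pip as `python -m pip` ties the installation to the interpreter that will later import the package, which is a common remedy when several Python environments coexist.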
Warnings
- gotcha By default, `zen_to_han` and `han_to_zen` convert all supported character types (kana, digits, ASCII letters). If you need to convert only specific types (e.g., only kana but not digits), you must explicitly set `kana=False`, `digit=False`, or `ascii=False` for the types you want to exclude from conversion.
- gotcha This Python `mojimoji` library is specifically for Japanese half-width and full-width character conversion. Be aware that other projects, concepts, or applications (e.g., a Ruby gem, a Discord bot, a social app, or the general Japanese word 'moji moji' meaning fidgeting) share a similar name but are entirely unrelated.
- bug Backslash characters (half-width '\' and full-width '＼') were not handled properly during conversion in versions prior to 0.0.13, leading to incorrect output for strings containing them.
Install
- pip install mojimoji
Imports
- mojimoji
from mojimoji import zen_to_han
import mojimoji
Quickstart
import mojimoji
# Convert full-width (zenkaku) to half-width (hankaku) characters
zenkaku_text = 'アイウａｂｃ０１２'
hankaku_text = mojimoji.zen_to_han(zenkaku_text)
print(f"'{zenkaku_text}' (full-width) -> '{hankaku_text}' (half-width)")
# Convert half-width (hankaku) to full-width (zenkaku) characters
hankaku_text_2 = 'ｱｲｳabc012'
zenkaku_text_2 = mojimoji.han_to_zen(hankaku_text_2)
print(f"'{hankaku_text_2}' (half-width) -> '{zenkaku_text_2}' (full-width)")
# Selective conversion: convert only digits to half-width,
# leaving kana and ASCII letters as they are
selective_zenkaku = '漢字１２３ひらがなＡＢＣ'
selective_hankaku = mojimoji.zen_to_han(selective_zenkaku, kana=False, ascii=False)
print(f"'{selective_zenkaku}' (selective zen->han) -> '{selective_hankaku}'")