{"id":4202,"library":"pypinyin","title":"Chinese Pinyin Conversion (pypinyin)","description":"pypinyin is a Python library for converting Chinese characters to Pinyin. It uses phrase context to pick the most fitting reading, and supports heteronyms (multi-pronunciation characters), simplified/traditional Chinese, Zhuyin, and various Pinyin styles (e.g., tone conventions). The library is actively maintained; the current release is 0.55.0.","status":"active","version":"0.55.0","language":"en","source_language":"en","source_url":"https://github.com/mozillazg/python-pinyin","tags":["Chinese","Pinyin","NLP","i18n","text-processing"],"install":[{"cmd":"pip install pypinyin","lang":"bash","label":"Install pypinyin"}],"dependencies":[],"imports":[{"note":"Main function for converting Chinese characters to Pinyin, returning a list of lists with tone marks by default.","symbol":"pinyin","correct":"from pypinyin import pinyin"},{"note":"Function for converting Chinese characters to Pinyin, returning a flat list without tone marks or heteronyms by default.","symbol":"lazy_pinyin","correct":"from pypinyin import lazy_pinyin"},{"note":"Enum containing various Pinyin output styles (e.g., TONE, FIRST_LETTER, TONE2, BOPOMOFO).","symbol":"Style","correct":"from pypinyin import Style"}],"quickstart":{"code":"from pypinyin import pinyin, lazy_pinyin, Style\n\nchinese_text = \"你好，世界！\"\n\n# Convert to Pinyin with tone marks (default style)\npinyin_result_toned = pinyin(chinese_text)\nprint(f\"Toned Pinyin: {pinyin_result_toned}\")\n\n# Convert to Pinyin without tone marks (lazy_pinyin)\npinyin_result_lazy = lazy_pinyin(chinese_text)\nprint(f\"Lazy Pinyin: {pinyin_result_lazy}\")\n\n# Convert to Pinyin using first letter style\npinyin_result_first_letter = pinyin(chinese_text, style=Style.FIRST_LETTER)\nprint(f\"First Letter Pinyin: {pinyin_result_first_letter}\")\n\n# Handle heteronyms (multi-pronunciation 
characters)\nheteronym_text = \"中心\"\npinyin_heteronym = pinyin(heteronym_text, heteronym=True)\nprint(f\"Heteronym Pinyin for '中心': {pinyin_heteronym}\")\n","lang":"python","description":"Demonstrates basic conversion of Chinese characters to Pinyin using the `pinyin` and `lazy_pinyin` functions, including different output styles and heteronyms."},"warnings":[{"fix":"To mark neutral tones with '5' in numeric tone styles, pass `neutral_tone_with_five=True`. To get 'ü' instead of 'v', pass `v_to_u=True` when calling `lazy_pinyin` or `pinyin` (for non-tone styles).","message":"By default, pypinyin results do not mark neutral tones, and non-tone styles use 'v' for 'ü'.","severity":"gotcha","affected_versions":"All versions (default behavior)"},{"fix":"If you need 'y', 'w', 'yu' to be counted as initials, pass `strict=False` to the `pinyin` or `lazy_pinyin` function. This is particularly relevant for `Style.INITIALS`.","message":"Under standard Pinyin rules, 'y', 'w', and 'yu' are not syllable initials. pypinyin follows these rules by default, so `Style.INITIALS` returns an empty string for syllables that begin with them.","severity":"gotcha","affected_versions":"All versions (default behavior)"},{"fix":"Use the `errors` parameter to control this behavior: `errors='ignore'` to remove them, `errors='replace'` to substitute each with its hexadecimal Unicode code point, `errors='exception'` to raise `PinyinNotFoundException`, or provide a callable for custom handling.","message":"Characters that have no Pinyin (e.g., symbols, non-Chinese characters) are returned as-is by default.","severity":"gotcha","affected_versions":"All versions (default behavior)"},{"fix":"Upgrade to `pypinyin` version 0.52.0 or newer. 
This version changed data loading from in-memory dicts to JSON files to mitigate the issue.","message":"On Python 3.12, versions of pypinyin prior to 0.52.0 suffer significant performance degradation during import, especially in debugging environments or with `pytest --cov`.","severity":"breaking","affected_versions":"<0.52.0 on Python 3.12"},{"fix":"Upgrade to `pypinyin` version 0.53.0 or newer, which includes built-in support for PyInstaller bundling, resolving common data file path issues.","message":"When bundling applications with PyInstaller, versions of pypinyin prior to 0.53.0 can fail to locate internal data files, leading to `no such file or directory: pinyin_dict.json` errors.","severity":"gotcha","affected_versions":"<0.53.0 with PyInstaller"}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}