{"id":6054,"library":"pyicu","title":"PyICU","description":"PyICU is a Python extension that wraps the International Components for Unicode (ICU) C++ libraries. It provides robust, full-featured Unicode and globalization support for applications, handling tasks such as locale-aware text formatting, collation, character set conversions, and boundary analysis. The current version is 2.16.2, and the library maintains an active release cadence, often aligning with updates to the underlying ICU C++ library.","status":"active","version":"2.16.2","language":"en","source_language":"en","source_url":"https://gitlab.pyicu.org/main/pyicu","tags":["i18n","unicode","localization","icu","internationalization"],"install":[{"cmd":"pip install pyicu","lang":"bash","label":"PyPI (requires pre-installed ICU C++ library)"},{"cmd":"brew install pkg-config icu4c\nexport PATH=\"$(brew --prefix)/opt/icu4c/bin:$(brew --prefix)/opt/icu4c/sbin:$PATH\"\nexport PKG_CONFIG_PATH=\"$PKG_CONFIG_PATH:$(brew --prefix)/opt/icu4c/lib/pkgconfig\"\npip install pyicu","lang":"bash","label":"macOS with Homebrew"},{"cmd":"apt-get install python3-icu","lang":"bash","label":"Debian/Ubuntu (system package)"}],"dependencies":[{"reason":"PyICU is a wrapper around the native ICU C++ libraries; these must be installed on the system for PyICU to function. 
The specific installation method varies by OS (e.g., Homebrew on macOS, apt on Debian/Ubuntu).","package":"ICU C++ libraries","optional":false},{"reason":"Used during compilation to locate the installed ICU C++ libraries.","package":"pkg-config","optional":false}],"imports":[{"symbol":"Locale","correct":"from icu import Locale"},{"symbol":"UnicodeString","correct":"from icu import UnicodeString"},{"symbol":"BreakIterator","correct":"from icu import BreakIterator"},{"symbol":"Collator","correct":"from icu import Collator"}],"quickstart":{"code":"import icu\n\n# Example 1: Locale-aware display name\nlocale = icu.Locale('pt_BR')\nname = locale.getDisplayName()\nprint(f\"Locale display name: {name}\")\n\n# Example 2: Text segmentation (grapheme clusters)\ntext_to_segment = \"café emoji 👨‍👩‍👧‍👦\"\n# BreakIterator offsets are UTF-16 code unit indices, so slice a\n# UnicodeString rather than the Python str (they differ for non-BMP text).\nustr = icu.UnicodeString(text_to_segment)\nbreaker = icu.BreakIterator.createCharacterInstance(icu.Locale())\nbreaker.setText(ustr)\n\ngrapheme_clusters = []\ni = 0\nfor j in breaker:\n    grapheme_clusters.append(str(ustr[i:j]))\n    i = j\n\nprint(f\"Grapheme clusters for '{text_to_segment}': {grapheme_clusters}\")","lang":"python","description":"This quickstart demonstrates how to create a `Locale` object to get a locale's display name and how to use `BreakIterator` for locale-aware text segmentation into grapheme clusters. PyICU exposes much of the underlying ICU C++ API directly."},"warnings":[{"fix":"Ensure ICU C++ libraries are installed via your system's package manager (e.g., `brew install icu4c`, `sudo apt-get install libicu-dev`) and that `pkg-config` can locate them or relevant environment variables are set.","message":"PyICU requires the underlying ICU C++ libraries to be installed on your system. `pip install pyicu` will only install the Python bindings; it does not install the native ICU libraries. 
Installation paths (e.g., `LD_LIBRARY_PATH`, `DYLD_LIBRARY_PATH`, `PATH`) or `pkg-config` setup might be necessary.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Consult the official ICU4C API reference (e.g., `https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/`) and infer the Python equivalents. Common patterns for string and date conversions are described in the PyICU README.","message":"PyICU's API closely mirrors the ICU4C C++ API, and there is no dedicated Python API documentation. Users must refer to the ICU4C C++ API documentation and translate patterns to Python.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Be aware of whether an ICU function expects an output `UnicodeString` to be modified in place or returns a new string. For non-UTF-8 encoded Python `str`, explicitly construct `UnicodeString(str, encodingName)`.","message":"Handling strings with PyICU can be nuanced due to the difference between ICU's mutable `UnicodeString` and Python's immutable `str` (or `unicode` in Python 2). ICU APIs may modify `UnicodeString` objects in place, while PyICU often overloads functions to accept Python `str` and convert to/from `UnicodeString` implicitly, assuming UTF-8 for `str` objects.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Ensure your CFLAGS include the appropriate C++ standard flag (`-std=c++11` or `-std=c++17`) when building PyICU from source, or use pre-built binaries if available for your environment.","message":"Depending on the version of the ICU C++ library you are building against, specific C++ standard compiler flags are required. ICU versions 60-74 require `-std=c++11`, while ICU 75 and later require `-std=c++17`. Failure to include the correct flag can lead to build errors.","severity":"breaking","affected_versions":"PyICU versions built against ICU 60+"},{"fix":"Ensure `pkg-config` is installed and configured to find your ICU libraries. 
Verify with `pkg-config --cflags --libs icu-i18n`.","message":"The `icu-config` program for locating ICU libraries has been deprecated since ICU 63.1. `pkg-config` is now the recommended tool for this purpose.","severity":"deprecated","affected_versions":"PyICU versions built against ICU 63.1+"}],"env_vars":null,"last_verified":"2026-04-14T00:00:00.000Z","next_check":"2026-07-13T00:00:00.000Z"}