{"id":6906,"library":"tangled-up-in-unicode","title":"Tangled Up in Unicode","description":"Tangled Up in Unicode is a Python library, currently at version 0.2.0, that provides access to the Unicode Character Database (UCD). It serves as an alternative to Python's standard `unicodedata` module, offering the latest UCD versions and extended character properties. Releases are typically aligned with new Unicode Standard versions.","status":"active","version":"0.2.0","language":"en","source_language":"en","source_url":"https://github.com/dylan-profiler/tangled-up-in-unicode","tags":["unicode","character database","unicodedata","localization","internationalization"],"install":[{"cmd":"pip install tangled-up-in-unicode","lang":"bash","label":"Install latest version"}],"dependencies":[],"imports":[{"note":"Commonly imported as 'unicodedata' for compatibility and ease of use, replacing the standard library module with an updated and extended version.","symbol":"tangled_up_in_unicode","correct":"import tangled_up_in_unicode as unicodedata"}],"quickstart":{"code":"import tangled_up_in_unicode as unicodedata\n\nchar = '$'\nprint(f\"--- Properties for '{char}' ---\")\nprint(f\"Name: {unicodedata.name(char)}\")\nprint(f\"Category (Short): {unicodedata.category(char)}\")\nprint(f\"Bidirectional (Short): {unicodedata.bidirectional(char)}\")\n\n# This library provides more properties and aliases than standard unicodedata\nprint(f\"Script (Long): {unicodedata.script(char, long=True)}\")\nprint(f\"Block (Long): {unicodedata.block(char, long=True)}\")\nprint(f\"UCD Version: {unicodedata.unidata_version}\")","lang":"python","description":"Demonstrates how to import and use `tangled-up-in-unicode` to retrieve character properties, including extended ones and aliases not available in the standard `unicodedata` module, and to check the UCD version."},"warnings":[{"fix":"Be aware of disk space consumption when including this library, especially in constrained environments or when creating 
many virtual environments. Consider whether the full UCD data is necessary, or whether a more lightweight approach is feasible for your use case. Future versions may address this by changing data storage formats.","message":"Installing `tangled-up-in-unicode` can produce a very large `site-packages` directory (1.8GB or more in a `venv`) because of the `.pyc` files generated from its large internal data dictionaries.","severity":"gotcha","affected_versions":"All versions up to 0.2.0"},{"fix":"If your application relies on catching `IndexError` for unknown scripts, update it to expect the string 'Unknown' on versions 0.0.7 and later. To support both older and newer versions, check the library version or handle both the `IndexError` (older versions) and the 'Unknown' return value (newer versions).","message":"Prior to version 0.0.7, querying for a script that was not in the lookup table raised an `IndexError`. From version 0.0.7 onwards, the lookup returns the string 'Unknown' instead.","severity":"gotcha","affected_versions":"< 0.0.7"}],"env_vars":null,"last_verified":"2026-04-15T00:00:00.000Z","next_check":"2026-07-14T00:00:00.000Z","problems":[]}