Language Data
language-data is a Python library that provides supplementary data about languages, intended primarily for use by the `langcodes` package. It bundles data from the Unicode Common Locale Data Repository (CLDR). The current version is 1.4.0. Releases are irregular and often coincide with new CLDR releases, typically once or twice a year.
Warnings
- breaking Starting with version 1.4.0, support for Python 3.9 has been officially dropped. Users on Python 3.9 or older must remain on language-data < 1.4.0.
- gotcha This library primarily serves as a data package for `langcodes` and does not expose a public API for direct interaction. Most users should interact with language data through `langcodes` rather than attempting to import or process `language-data`'s internal structures directly.
- gotcha The data provided by `language-data` is updated based on new CLDR releases. This means that language properties, aliases, and official codes can subtly change between `language-data` versions. Always review the changelog for CLDR updates when upgrading to ensure compatibility with your application's assumptions.
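Since two of the warnings above hinge on which versions are installed, it can help to log them at startup. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of `langcodes` or `language-data`.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent.

    Hypothetical helper for illustration; not provided by either package.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Report the interpreter version alongside both distributions.
print("Python:", ".".join(str(part) for part in sys.version_info[:3]))
for dist in ("langcodes", "language-data"):
    print(dist, "->", installed_version(dist) or "not installed")
```

Recording these values next to any stored language names or codes makes it easier to trace a changed name back to a data upgrade.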
Install
pip install language-data
Imports
- language_data
import langcodes
# The language-data library provides data that langcodes uses automatically.
# Direct imports from language_data are generally not needed for its primary use case.
Quickstart
# language-data is a data dependency for langcodes.
# After installing both 'langcodes' and 'language-data',
# langcodes will automatically utilize the data provided by language-data.
import langcodes
# Example: Get a Language object from langcodes
english_us = langcodes.Language.get('en-US')
print(f"Code: {english_us}")
print(f"English Name: {english_us.display_name('en')}")
# Example: Work with a less common language code
nl_be = langcodes.Language.get('nl-BE')
print(f"\nCode: {nl_be}")
print(f"French Name: {nl_be.display_name('fr')}")
print(f"Dutch Name: {nl_be.display_name('nl')}")
# The functionality of langcodes, such as display_name, depends on the data
# provided by the language-data package. If language-data is not installed,
# name lookups such as display_name raise an ImportError instead of falling
# back to a built-in dataset. An older language-data version may return
# names based on an older CLDR release.
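
If an application should not depend on how `langcodes` behaves when the data package is missing, one option is to check for it up front. This is a sketch under stated assumptions: the helpers `module_available` and `describe` are hypothetical, and echoing the raw tag is just one possible fallback policy.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(name) is not None


def describe(tag, in_language="en"):
    """Return a display name for a language tag, or the tag itself as a fallback.

    Hypothetical helper: the fallback (returning the raw tag) is an
    illustrative choice, not behavior defined by langcodes.
    """
    if not (module_available("langcodes") and module_available("language_data")):
        return tag
    import langcodes

    return langcodes.Language.get(tag).display_name(in_language)


print(describe("nl-BE"))
print(describe("nl-BE", "fr"))
```

Checking with `find_spec` avoids importing the data package just to test for its presence, so the check itself stays cheap.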