langcodes

3.5.1 · active · verified Sun Apr 05

langcodes is a Python library (current version 3.5.1) for parsing, manipulating, and comparing IETF language tags (BCP 47), the standard identifiers for human languages. It normalizes tags, measures how closely two languages match, and draws on ISO 639 language codes and Unicode CLDR data. The project is actively maintained, with minor versions released periodically.
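
To make the structure of a BCP 47 tag concrete, here is a minimal standard-library sketch that splits a simple tag into language, script, and territory subtags. It is not how langcodes works internally: split_simple_tag is a hypothetical helper written for illustration, and it deliberately ignores extlang, variant, extension, and private-use subtags as well as all of the alias and normalization data that langcodes applies.

```python
def split_simple_tag(tag):
    """Split a simple BCP 47 tag into (language, script, territory).

    Hypothetical illustration only; real tags need a full parser
    such as langcodes.Language.get().
    """
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    script = None
    territory = None
    for part in parts[1:]:
        if len(part) == 1:
            # A single-character subtag starts an extension or
            # private-use section, which this sketch does not handle.
            break
        if len(part) == 4 and part.isalpha() and script is None and territory is None:
            # Script subtags are four letters, conventionally title-cased.
            script = part.title()
        elif territory is None and (
            (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit())
        ):
            # Territory subtags are two letters or three digits.
            territory = part.upper()
    return language, script, territory


print(split_simple_tag("en-US"))
print(split_simple_tag("zh-hant-tw"))
print(split_simple_tag("es-419"))
```

The shape of each subtag (length, letters versus digits) is what identifies its role, which is why 'zh-hant-tw' can be read unambiguously even in the wrong letter case.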

Quickstart

Demonstrates parsing and normalizing language tags, comparing two languages by distance, finding the closest match in a list of supported languages, and retrieving display names (which requires the optional 'langcodes[data]' extra).

from langcodes import Language, standardize_tag, closest_match

# Parse a language tag
english_us = Language.get('en-US')
print(f"Parsed 'en-US': language={english_us.language}, script={english_us.script}, territory={english_us.territory}")

# Normalize a language tag
normalized_tag = standardize_tag('zh-CN')
print(f"Normalized 'zh-CN': {normalized_tag}")

# Compare languages using distance (lower is better match)
french = Language.get('fr')
canadian_french = Language.get('fr-CA')
print(f"Distance between 'fr' and 'fr-CA': {french.distance(canadian_french)}")

# Find the closest match from a list of supported languages
desired = 'en-GB'
supported = ['en-US', 'en-AU', 'fr-CA']
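# closest_match returns a (tag, distance) tuple
closest_tag, closest_distance = closest_match(desired, supported)
print(f"Closest match for '{desired}' in {supported}: {closest_tag} (distance {closest_distance})")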

# Get display names (requires 'langcodes[data]' to be installed)
try:
    spanish_name_in_english = Language.get('es').display_name('en')
    print(f"Name of 'es' in English: {spanish_name_in_english}")
except ImportError:
    print("Install 'langcodes[data]' (e.g., pip install langcodes[data]) for language names and statistics.")
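
A common use of closest_match is choosing which supported locale to serve for a user's requested language, with a fallback when nothing is close enough. The sketch below assumes langcodes 3.x, where closest_match returns a (tag, distance) tuple and reports the tag 'und' when no supported language is within max_distance. The helper name pick_supported and its default argument are hypothetical, introduced only for this example.

```python
from langcodes import closest_match


def pick_supported(desired, supported, default="en", max_distance=25):
    """Return the best supported tag for `desired`, or `default` if none is close.

    Hypothetical helper: relies on closest_match returning ('und', ...)
    when no supported tag is within max_distance.
    """
    tag, _distance = closest_match(desired, supported, max_distance=max_distance)
    return default if tag == "und" else tag


print(pick_supported("fr", ["de", "en", "fr"]))
print(pick_supported("fr-CA", ["fr", "en"]))
print(pick_supported("ja", ["en", "fr"]))
```

Lowering max_distance makes the helper stricter about what counts as an acceptable substitute, while raising it allows looser matches before falling back to the default.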
