uroman - Universal Romanizer

1.3.1.1 · active · verified Sun Apr 12

uroman is a universal romanizer that converts text in any script to the Latin alphabet. Version 1.3.1.1 is the current stable release. With v1.3.1 the library was rewritten from Perl to Python, with improved support for languages including Coptic, Thai, Khmer, and Tibetan. New releases appear periodically and extend language support and features.

Warnings

Install

Imports

Quickstart

Demonstrates how to import the uroman library, create a `Uroman` instance, and use its `romanize_string` method to convert text from various scripts to the Latin alphabet.

import uroman as ur

# Load the romanization data once and reuse the instance for every call.
romanizer = ur.Uroman()

text_in_any_script = "你好世界"
romanized_text = romanizer.romanize_string(text_in_any_script)

print(f"Original: {text_in_any_script}")
print(f"Romanized: {romanized_text}")

# Example with another script
text_arabic = "مرحبا بالعالم"
print(f"Arabic: {text_arabic} -> {romanizer.romanize_string(text_arabic)}")
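When many strings need romanizing, it may help to call the romanizer only once per distinct input. The sketch below is not part of uroman: `romanize_all` is a hypothetical helper that accepts any callable mapping a string to a string. The guarded usage section assumes the Python package exposes a `Uroman` class with a `romanize_string` method; check this against the version you have installed.

```python
from typing import Callable, Dict, Iterable


def romanize_all(texts: Iterable[str], romanize: Callable[[str], str]) -> Dict[str, str]:
    """Map each distinct input string to its romanization.

    `romanize` is called once per distinct string; repeated inputs
    reuse the stored result. Insertion order follows first occurrence.
    """
    results: Dict[str, str] = {}
    for text in texts:
        if text not in results:
            results[text] = romanize(text)
    return results


if __name__ == "__main__":
    try:
        import uroman as ur  # assumption: installed via the `uroman` package
    except ImportError:
        print("uroman is not installed; skipping the demo")
    else:
        # Assumption: `Uroman()` loads the data and `romanize_string` romanizes one string.
        romanizer = ur.Uroman()
        samples = ["你好世界", "مرحبا بالعالم", "你好世界"]
        for original, latin in romanize_all(samples, romanizer.romanize_string).items():
            print(f"{original} -> {latin}")
```

Passing the romanizer in as a callable keeps the helper independent of the library, so it can be exercised with a stand-in function where uroman is unavailable.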
