Fold-to-ASCII

1.0.2.post1 · active · verified Fri Apr 17

This library is a Python port of the Apache Lucene ASCII Folding Filter. It converts alphabetic, numeric, and symbolic Unicode characters outside the Basic Latin block (U+0000 to U+007F) into their ASCII equivalents, where such equivalents exist. The current version is 1.0.2.post1. It is a stable, low-cadence utility focused on character folding rather than full transliteration.
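To make the difference between folding and transliteration concrete, here is a minimal sketch of table-driven folding. It is not the library's implementation: `FOLD_TABLE`, `fold_sketch`, and the `replacement` parameter are hypothetical names chosen for illustration, and the table holds only a handful of entries, whereas the Lucene filter's mapping is far larger.

```python
# Hypothetical, deliberately tiny mapping table (illustration only).
FOLD_TABLE = {
    "è": "e",
    "é": "e",
    "û": "u",
    "Æ": "AE",   # one character may fold to several ASCII characters
    "ß": "ss",
    "①": "1",    # numeric and symbolic characters can fold too
}


def fold_sketch(text: str, replacement: str = "") -> str:
    """Fold characters via table lookup; ASCII passes through unchanged.

    Characters that are neither ASCII nor in the table are swapped for
    `replacement` (an assumption of this sketch, not a documented default).
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # already in the Basic Latin block
        else:
            out.append(FOLD_TABLE.get(ch, replacement))
    return "".join(out)


print(fold_sketch("Crème brûlée"))
print(fold_sketch("你好", replacement="?"))
```

Because folding is a per-character lookup, scripts with no ASCII counterpart (such as the CJK characters above) are not romanized the way a transliteration tool would attempt. They fall through to the replacement.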

Common errors

Warnings

Install

Imports

Quickstart

Demonstrates `fold` on two inputs: text whose accented characters have direct ASCII equivalents, and text containing characters that do not (CJK characters and an emoji). Each example prints the original and folded strings side by side for comparison.

from fold_to_ascii import fold

# Example 1: Characters with direct ASCII equivalents
text_with_accents = "Crème brûlée is delicious!"
ascii_text = fold(text_with_accents)
print(f"Original: {text_with_accents}")
print(f"Folded:   {ascii_text}")

# Example 2: Characters without direct ASCII equivalents
text_with_non_folding = "Hello, 你好 👋 world!"
ascii_text_non_folding = fold(text_with_non_folding)
print(f"Original: {text_with_non_folding}")
print(f"Folded:   {ascii_text_non_folding}")
