Arabic Reshaper
Arabic Reshaper is a Python library that reconstructs Arabic sentences for use in applications that do not natively support Arabic script rendering. It applies the shaping rules of Arabic, converting disconnected characters into their correctly connected forms (initial, medial, final, isolated) and handling ligatures. The library is actively maintained; its latest major release is v3.0.0, and updates are published as needed rather than on a fixed schedule.
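The four contextual forms mentioned above exist as separate code points in Unicode's Arabic Presentation Forms blocks. As a minimal sketch using only the standard library (the helper name `describe_form` is hypothetical and not part of arabic-reshaper), the following inspects the four forms of the letter beh (U+0628) and shows how each maps back to the base letter:

```python
import unicodedata

# The four presentation forms of ARABIC LETTER BEH (U+0628).
BEH_FORMS = {
    "isolated": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "medial": "\uFE92",
}

def describe_form(ch):
    """Return (form_tag, base_letter) for a presentation-form character."""
    # The decomposition looks like "<initial> 0628": a tag plus the base code point.
    tag, base_hex = unicodedata.decomposition(ch).split()
    return tag.strip("<>"), chr(int(base_hex, 16))

for label, ch in BEH_FORMS.items():
    tag, base = describe_form(ch)
    print(f"{label:9} U+{ord(ch):04X} -> tag={tag}, base=U+{ord(base):04X}")
```

Reshaping goes in the opposite direction: it starts from base letters and picks the form that fits each letter's position in the word.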
Warnings
- breaking Python 2.x support has been completely dropped in version 3.0.0. Applications targeting Python 2 must use an older version (e.g., 2.x.x) or migrate to Python 3.
- gotcha Using `config_for_true_type_font` requires the `fonttools` package, which is installed as an extra dependency (`pip install arabic-reshaper[with-fonttools]`). Without it, calling this function raises an `ImportError` (typically its subclass `ModuleNotFoundError`).
- gotcha Arabic Reshaper focuses solely on character shaping. For correct Right-to-Left (RTL) text display, especially when mixing Arabic and LTR languages, you will likely need to use a bidirectional (bidi) algorithm library (e.g., `python-bidi`) in conjunction with `arabic-reshaper`.
- gotcha By default, `delete_harakat` is `True`, meaning diacritics (tashkeel) are removed from the text during reshaping. To preserve harakat, set `'delete_harakat': False` in the configuration dictionary passed to your `ArabicReshaper` instance.
- gotcha Some fonts may be missing the isolated forms for certain Arabic letters, leading to incorrect rendering even after reshaping. The `use_unshaped_instead_of_isolated` configuration option can mitigate this by forcing the use of unshaped forms instead.
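The bidi warning above can be sketched as a two-step pipeline: reshape first, then reorder for display. This is a minimal sketch, assuming `python-bidi` is installed and exposes `bidi.algorithm.get_display`; the wrapper name `shape_for_display` is hypothetical.

```python
def shape_for_display(text):
    """Reshape Arabic text, then reorder it for a left-to-right renderer."""
    # Imported lazily so this module still loads where the packages are absent.
    import arabic_reshaper
    from bidi.algorithm import get_display  # assumption: python-bidi is installed

    reshaped = arabic_reshaper.reshape(text)  # step 1: pick connected letter forms
    return get_display(reshaped)              # step 2: apply the bidi algorithm


if __name__ == "__main__":
    print(shape_for_display("اللغة العربية رائعة"))
```

The order matters: reshaping depends on each letter's neighbours in logical order, so it has to run before the text is reordered visually.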
Install
- pip install arabic-reshaper
- pip install --upgrade arabic-reshaper[with-fonttools]
Imports
- reshape
  import arabic_reshaper
  reshaped_text = arabic_reshaper.reshape('text')
- ArabicReshaper
  from arabic_reshaper import ArabicReshaper
  reshaper = ArabicReshaper(configuration={'support_ligatures': True})
  reshaped_text = reshaper.reshape('text')
- config_for_true_type_font
  from arabic_reshaper import ArabicReshaper, config_for_true_type_font, ENABLE_ALL_LIGATURES
  reshaper = ArabicReshaper(config_for_true_type_font('/path/to/font.ttf', ENABLE_ALL_LIGATURES))
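Because `config_for_true_type_font` depends on the optional `fonttools` extra, it can help to check for it before building a font-specific reshaper. This is a hedged sketch: the helpers `fonttools_available` and `make_reshaper` are hypothetical names, and falling back to the default configuration is one possible policy, not something the library does for you.

```python
import importlib.util

def fonttools_available():
    """Return True if the fontTools module (shipped by fonttools) is importable."""
    return importlib.util.find_spec("fontTools") is not None

def make_reshaper(font_path=None):
    """Build a font-aware reshaper when possible, else a default one."""
    import arabic_reshaper  # imported lazily; required only when this is called

    if font_path is not None and fonttools_available():
        # Derive the configuration from the glyphs the font actually provides.
        config = arabic_reshaper.config_for_true_type_font(
            font_path, arabic_reshaper.ENABLE_ALL_LIGATURES
        )
        return arabic_reshaper.ArabicReshaper(config)
    # Assumed fallback: the library's default configuration.
    return arabic_reshaper.ArabicReshaper()

print("fonttools installed:", fonttools_available())
```

Calling `make_reshaper('/path/to/font.ttf')` would then return a reshaper configured for that font when the extra is installed, and a default one otherwise.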
Quickstart
import arabic_reshaper
# Basic reshaping
text_to_be_reshaped = 'اللغة العربية رائعة'
reshaped_text = arabic_reshaper.reshape(text_to_be_reshaped)
print(f"Original: {text_to_be_reshaped}")
print(f"Reshaped: {reshaped_text}")
# Reshaping with custom configuration
from arabic_reshaper import ArabicReshaper
configuration = {
    'delete_harakat': False,  # Keep diacritics
    'support_ligatures': True,
}
reshaper = ArabicReshaper(configuration=configuration)
text_with_harakat = 'الْعَرَبيَّةُ'
reshaped_text_custom = reshaper.reshape(text_with_harakat)
print(f"Original (with harakat): {text_with_harakat}")
print(f"Reshaped (custom): {reshaped_text_custom}")
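To check whether a configuration kept the diacritics, one rough approach is to count combining marks before and after reshaping. The sketch below uses only the standard library; `count_combining_marks` is a hypothetical helper, and it assumes preserved harakat remain ordinary nonspacing marks (Unicode category `Mn`) in the reshaped output.

```python
import unicodedata

def count_combining_marks(text):
    """Count nonspacing marks (category Mn), which include the Arabic harakat."""
    return sum(1 for ch in text if unicodedata.category(ch) == "Mn")

# beh + fatha, then beh + shadda: two letters carrying one mark each.
sample = "\u0628\u064E\u0628\u0651"
print(count_combining_marks(sample))
```

Comparing `count_combining_marks(text_with_harakat)` with `count_combining_marks(reshaped_text_custom)` gives a quick indication of whether marks were dropped. Treat it as a rough check, since ligature settings may merge some marks into other code points.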