{"id":9556,"library":"bnunicodenormalizer","title":"Bangla Unicode Normalizer","description":"bnunicodenormalizer (v0.1.7) is a Python library for normalizing Bangla Unicode text. It cleans and standardizes Bangla text by resolving inconsistent character representations, digit forms, and other common encoding problems, making the text suitable for Natural Language Processing (NLP) tasks. The library saw active development in mid-2023 and is currently in maintenance.","status":"maintenance","version":"0.1.7","language":"en","source_language":"en","source_url":"https://github.com/mnansary/bnUnicodeNormalizer","tags":["bangla","unicode","normalization","nlp","text-processing"],"install":[{"cmd":"pip install bnunicodenormalizer","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Logging functionality","package":"loguru"},{"reason":"Progress bars for operations","package":"tqdm"},{"reason":"Advanced regular expression matching","package":"regex"},{"reason":"Language detection functionality; can be heavy to install","package":"fasttext","optional":false}],"imports":[{"symbol":"Normalizer","correct":"from bnunicodenormalizer import Normalizer"}],"quickstart":{"code":"from bnunicodenormalizer import Normalizer\n\n# Initialize the normalizer with its default settings.\nbn_normalize = Normalizer()\n\ndef normalize_text(text):\n    # The normalizer works on one word at a time, so split on whitespace first.\n    results = [bn_normalize(word) for word in text.split()]\n    # Each result is a dict; 'normalized' may be None when nothing valid remains.\n    words = [r[\"normalized\"] for r in results if r[\"normalized\"] is not None]\n    return \" \".join(words)\n\ntext_to_normalize = \"এই টেস্টিং টেক্সট।  ১০০ টাকা ।\"\nprint(f\"Original: {text_to_normalize}\")\nprint(f\"Normalized: {normalize_text(text_to_normalize)}\")\n\n# Each result dict also carries 'given' (the input word) and\n# 'ops' (the list of operations that were applied).","lang":"python","description":"Demonstrates how to initialize the Normalizer and use it to normalize a Bangla sentence word by word. The default initialization attempts to load necessary mapping files from the installed package directory."},"warnings":[{"fix":"Ensure the provided `romanize_mapping_path` is an absolute, correct path to a valid JSON file, or omit the argument to use the default package-provided mapping.","message":"The `Normalizer` class can optionally take a `romanize_mapping_path` argument. If a custom path is provided but is incorrect or the file is missing, initialization raises a `FileNotFoundError`. If the argument is omitted, the class attempts to load a default file from the package installation directory.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Pin your dependency to an exact version (`bnunicodenormalizer==0.1.7`) in `requirements.txt` or `pyproject.toml` to prevent unexpected updates. Review release notes for new versions before upgrading.","message":"As a `0.x.x` library, minor version increments (e.g., from 0.1.x to 0.2.x) can introduce breaking changes, since SemVer makes no stability guarantees before 1.0.0. No explicit breaking changes are documented between recent `0.1.x` versions.","severity":"breaking","affected_versions":"All `0.x.x` versions"},{"fix":"If `fasttext` installation fails, consult the `fasttext` documentation for system-specific prerequisites (e.g., `build-essential` on Debian/Ubuntu, Xcode Command Line Tools on macOS). You may need to install it separately first (`pip install fasttext`).","message":"The library has a direct dependency on `fasttext`. Installing `fasttext` can be difficult because it builds native code and needs a C++ compiler. If `fasttext` fails to install correctly, the language detection features of `bnunicodenormalizer` will be unavailable or may cause errors, even if normalization functions still work.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-17T00:00:00.000Z","next_check":"2026-07-16T00:00:00.000Z","problems":[{"fix":"Ensure `bnunicodenormalizer` is properly installed via `pip install bnunicodenormalizer`. If providing a custom `romanize_mapping_path`, double-check that the file exists and that its permissions allow reading.","cause":"The `Normalizer` tried to load the default `romanize_map.json` but could not find it, often due to an incomplete installation or running from a non-standard environment.","error":"FileNotFoundError: [Errno 2] No such file or directory: '.../bnunicodenormalizer/romanize_map.json'"},{"fix":"Install the package using `pip install bnunicodenormalizer`. If using virtual environments, ensure your IDE or terminal has the correct environment activated.","cause":"The `bnunicodenormalizer` package is not installed in the active Python environment.","error":"ModuleNotFoundError: No module named 'bnunicodenormalizer'"},{"fix":"Access dictionary keys using square bracket notation, e.g., `result['normalized']`. The returned dictionary has the keys `given`, `normalized`, and `ops`.","cause":"The output of `bn_normalize(word)` is a dictionary. You are attempting to access a key as an attribute.","error":"AttributeError: 'dict' object has no attribute 'normalized'"}]}