{"id":242,"library":"charset-normalizer","title":"Charset Normalizer","description":"Charset-normalizer is a truly universal charset encoding detector for Python. It detects the encoding of raw bytes/files using a heuristic, non-training-based approach and can optionally identify the spoken language of the content. All IANA character set names supported by CPython codecs are supported. The library also ships a `normalizer` CLI tool and a drop-in `detect()` shim for Chardet migration. Current version is 3.4.6 (released March 2026); releases follow Semantic Versioning with frequent minor/patch cadence.","status":"active","version":"3.4.6","language":"python","source_language":"en","source_url":"https://github.com/jawah/charset_normalizer","tags":["encoding","charset","detection","chardet","unicode","text","normalization","i18n"],"install":[{"cmd":"pip install charset-normalizer","lang":"bash","label":"PyPI (pure Python)"},{"cmd":"pip install charset-normalizer -U","lang":"bash","label":"Upgrade to latest"}],"dependencies":[],"imports":[{"note":"Primary API for detecting encoding of a bytes/bytearray object. Returns a CharsetMatches container.","symbol":"from_bytes","correct":"from charset_normalizer import from_bytes"},{"note":"Primary API for detecting encoding of a file on disk; accepts str, bytes, or os.PathLike.","symbol":"from_path","correct":"from charset_normalizer import from_path"},{"note":"Primary API for detecting encoding from an already-open binary file pointer. Does NOT close the file pointer.","symbol":"from_fp","correct":"from charset_normalizer import from_fp"},{"note":"Legacy Chardet-compatible shim. Officially deprecated in favour of from_bytes; not planned for removal. Returns a dict with 'encoding', 'confidence', 'language'.","wrong":"import chardet; chardet.detect(...)","symbol":"detect","correct":"from charset_normalizer import detect"},{"note":"Utility to detect whether bytes/path/fp point to binary (non-text) content. 
Added in 3.2.0.","symbol":"is_binary","correct":"from charset_normalizer import is_binary"},{"note":"Formerly named CharsetNormalizerMatches. The legacy alias was removed in 3.0; use CharsetMatches.","wrong":"from charset_normalizer import CharsetNormalizerMatches","symbol":"CharsetMatches","correct":"from charset_normalizer.models import CharsetMatches"},{"note":"Formerly named CharsetNormalizerMatch. The legacy alias was removed in 3.0; use CharsetMatch.","wrong":"from charset_normalizer import CharsetNormalizerMatch","symbol":"CharsetMatch","correct":"from charset_normalizer.models import CharsetMatch"}],"quickstart":{"code":"from charset_normalizer import from_bytes, from_path, detect\n\n# --- from raw bytes ---\nraw = b'\\xff\\xfe' + 'Hello, world!'.encode('utf-16-le')\nresults = from_bytes(raw)\nbest = results.best()\nif best is not None:\n    print('Encoding:', best.encoding)        # e.g. 'utf_16'\n    print('Language:', best.language)        # e.g. 'English' or 'Unknown'\n    print('Decoded :', str(best))            # decoded unicode string\nelse:\n    print('Could not detect encoding (possibly binary data)')\n\n# --- from a file path ---\n# results2 = from_path('./data/sample.txt')\n# print(str(results2.best()))\n\n# --- Chardet-compatible legacy shim (deprecated but stable) ---\nresult = detect(raw)\nprint(result)  # e.g. {'encoding': 'UTF-16', 'confidence': 1.0, 'language': ''}\nif result['encoding']:\n    decoded = raw.decode(result['encoding'])\n    print('Legacy decoded:', decoded)\n","lang":"python","description":"Detect encoding of raw bytes, decode the content, and use the Chardet-compatible legacy shim."},"warnings":[{"fix":"Replace with CharsetMatch and CharsetMatches imported from charset_normalizer.models, or use the top-level from_bytes/from_path functions directly.","message":"Class aliases CharsetNormalizerMatch, CharsetNormalizerMatches, CharsetDetector, and CharsetDoctor were removed in 3.0. 
Code referencing these names will raise ImportError or AttributeError.","severity":"breaking","affected_versions":">=3.0"},{"fix":"Pin charset-normalizer<3.1 for Python 3.6, or upgrade the Python interpreter.","message":"Python 3.6 support was dropped in 3.1.0, and Python 3.5 support was dropped in 2.1.0. Installing 3.1 or later on Python 3.6 is unsupported.","severity":"breaking","affected_versions":">=3.1 for Python 3.6; >=2.1 for Python 3.5"},{"fix":"Migrate to from_bytes(...).best() for new code. Check best() for None before calling str() or accessing .encoding.","message":"detect() is the legacy Chardet-compatible shim and is officially deprecated. It also lowers confidence automatically for small byte samples (3.4.3+), so results on short inputs may differ from Chardet.","severity":"gotcha","affected_versions":">=3.0"},{"fix":"Always pass the full byte sequence. Do not slice input for 'performance'; the library already samples internally (5 blocks of 512 bytes by default).","message":"Feeding truncated or incomplete multi-byte sequences (e.g. a partial UTF-16 or UTF-32 file) will likely produce incorrect or empty detection results. The library is not designed for streaming partial payloads.","severity":"gotcha","affected_versions":"all"},{"fix":"Use: result = from_bytes(raw).best(); text = str(result) if result is not None else ''","message":"from_bytes/from_path return a CharsetMatches container, not a string or a single result. Calling str() directly on the container does not return the decoded text. Always call .best() first, then check for None.","severity":"gotcha","affected_versions":"all"},{"fix":"Always use: from charset_normalizer import ...","message":"The import name uses an underscore (charset_normalizer) but the PyPI/install name uses a hyphen (charset-normalizer). Writing import charset-normalizer raises a SyntaxError.","severity":"gotcha","affected_versions":"all"},{"fix":"Do not import internal modules. 
Use only the public API: from_bytes, from_path, from_fp, detect, is_binary.","message":"Internal module charset_normalizer.assets was moved into charset_normalizer.constant in 3.3.x. Any code importing from charset_normalizer.assets directly will break on 3.3+.","severity":"deprecated","affected_versions":">=3.3"}],"env_vars":null,"last_verified":"2026-05-12T12:17:48.698Z","next_check":"2026-06-25T00:00:00.000Z","problems":[{"fix":"Reinstall the package cleanly using `pip install --force-reinstall charset-normalizer` or, if using conda, `conda install -c conda-forge charset-normalizer` after uninstalling any existing version.","cause":"This error typically indicates a corrupted or incomplete installation of `charset-normalizer`, often due to file shadowing, stale `__pycache__` files, or issues within specific build environments like PyInstaller.","error":"AttributeError: partially initialized module 'charset_normalizer' has no attribute 'md__mypyc' (most likely due to a circular import)"},{"fix":"Install the package using `pip install charset-normalizer` or `conda install charset-normalizer` depending on your environment.","cause":"The `charset-normalizer` package is not installed in the active Python environment or is not discoverable in the Python path.","error":"ModuleNotFoundError: No module named 'charset_normalizer'"},{"fix":"Ensure `charset-normalizer` is installed in an environment whose scripts directory is in your system's PATH, or run the tool using `python -m charset_normalizer`.","cause":"The `normalizer` CLI tool, which comes with the `charset-normalizer` library, is not found in your system's PATH or was not installed correctly.","error":"normalizer: command not found"},{"fix":"Cleanly uninstall both `charset-normalizer` and any directly dependent libraries (like `chardet` if present), then reinstall `charset-normalizer` and the dependent libraries to ensure compatible versions are used.","cause":"This usually points to a version incompatibility or a 
corrupted installation, often occurring when `charset-normalizer` is used alongside other libraries (like `transformers` or `chardet`) that expect a different internal structure or version.","error":"ImportError: cannot import name 'COMMON_SAFE_ASCII_CHARACTERS' from 'charset_normalizer.constant'"}],"ecosystem":"pypi","meta_description":null,"install_score":100,"install_tag":"verified","quickstart_score":80,"quickstart_tag":"verified","pypi_latest":null,"install_checks":{"last_tested":"2026-05-12","tag":"verified","tag_description":"installs cleanly on critical runtimes, fast import, recently tested","results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.27,"mem_mb":3.7,"disk_size":"18.6M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.26,"mem_mb":3.7,"disk_size":"18.6M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.16,"mem_mb":3.7,"disk_size":"19M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.16,"mem_mb":3.7,"disk_size":"19M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.34,"mem_mb":4.2,"disk_size":"20.5M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine 
(musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.35,"mem_mb":4.2,"disk_size":"20.5M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.27,"mem_mb":4.2,"disk_size":"21M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.28,"mem_mb":4.2,"disk_size":"21M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.34,"mem_mb":4,"disk_size":"12.4M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.34,"mem_mb":4,"disk_size":"12.4M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.29,"mem_mb":4,"disk_size":"13M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.3,"mem_mb":4,"disk_size":"13M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.36,"mem_mb":4.3,"disk_size":"12.0M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine 
(musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.33,"mem_mb":4.3,"disk_size":"12.0M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.33,"mem_mb":4.1,"disk_size":"12M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.31,"mem_mb":4.1,"disk_size":"12M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.25,"mem_mb":3.6,"disk_size":"18.1M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.26,"mem_mb":3.6,"disk_size":"18.1M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.2,"mem_mb":3.6,"disk_size":"19M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.2,"mem_mb":3.6,"disk_size":"19M"}]},"quickstart_checks":{"last_tested":"2026-04-23","tag":"verified","tag_description":"quickstart runs on critical runtimes, recently 
tested","results":[{"runtime":"python:3.10-alpine","exit_code":0},{"runtime":"python:3.10-slim","exit_code":0},{"runtime":"python:3.11-alpine","exit_code":0},{"runtime":"python:3.11-slim","exit_code":0},{"runtime":"python:3.12-alpine","exit_code":0},{"runtime":"python:3.12-slim","exit_code":0},{"runtime":"python:3.13-alpine","exit_code":0},{"runtime":"python:3.13-slim","exit_code":0},{"runtime":"python:3.9-alpine","exit_code":0},{"runtime":"python:3.9-slim","exit_code":0}]}}