chardet: Universal Character Encoding Detector
chardet is a Python library that detects the character encoding of byte strings. For each input it reports the detected encoding, a confidence score, and the language. The current version is 7.4.0.post1, released on March 14, 2026. The project is actively maintained, with updates focused on accuracy and performance. It requires Python 3.10 or higher, has zero runtime dependencies, and also runs on PyPy.
Warnings
- breaking: chardet 7.0 is a ground-up, MIT-licensed rewrite that keeps the same package name and the same public API, so it serves as a drop-in replacement for chardet 5.x/6.x. It requires Python 3.10+, has zero runtime dependencies, and works on PyPy.
- deprecated: chardet 7.0.0 introduced a new dense, zlib-compressed model format (v2) that significantly reduces cold-start time. Make sure your environment supports this format to get the faster startup.
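Because 7.0 is a rewrite published under the same package name, it can help to confirm which major version is actually installed before relying on 7.x behavior. The sketch below uses only the standard library's importlib.metadata. The helper name installed_major is hypothetical and not part of chardet, and it assumes the version string starts with a plain numeric component.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_major(package: str = "chardet") -> Optional[int]:
    """Return the installed major version of `package`, or None if it is missing.

    Hypothetical helper for illustration; not part of chardet itself.
    """
    try:
        raw = version(package)
    except PackageNotFoundError:
        return None
    # e.g. "7.4.0.post1" -> 7; assumes a plain numeric leading component
    return int(raw.split(".")[0])


major = installed_major()
if major is None:
    print("chardet is not installed")
elif major >= 7:
    print(f"chardet {major}.x (rewrite) is installed")
else:
    print(f"chardet {major}.x (pre-rewrite) is installed")
```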
Install
pip install chardet
Imports
- detect
from chardet import detect
Quickstart
import chardet
# Detect encoding of a byte string
result = chardet.detect(b'Hello, world!')
print(result)
# Example output (exact values may vary by version):
# {'encoding': 'ascii', 'confidence': 1.0, 'language': 'en'}
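In practice the detection result is usually fed straight into bytes.decode. The following is a minimal sketch of that step, not an official recipe. It assumes detect returns a dict with the 'encoding' and 'confidence' keys shown above. The function name decode_bytes, the min_confidence threshold, and the detector parameter are all hypothetical choices made for this example.

```python
from typing import Callable, Optional


def decode_bytes(
    data: bytes,
    min_confidence: float = 0.5,  # hypothetical threshold; tune for your data
    fallback: str = "utf-8",
    detector: Optional[Callable[[bytes], dict]] = None,
) -> str:
    """Decode `data` with the detected encoding, or `fallback` if detection is weak."""
    if detector is None:
        # Imported lazily so a stub detector can be injected in tests.
        import chardet
        detector = chardet.detect

    result = detector(data)
    encoding = result.get("encoding")
    confidence = result.get("confidence") or 0.0

    # The encoding may be None (for example on empty input), and a
    # low-confidence guess is treated here the same as no guess at all.
    if not encoding or confidence < min_confidence:
        encoding = fallback

    try:
        # errors="replace" avoids raising if the guess turns out to be wrong.
        return data.decode(encoding, errors="replace")
    except LookupError:
        # The detected name is not a codec Python knows; use the fallback.
        return data.decode(fallback, errors="replace")
```

Calling decode_bytes(raw) with bytes read from a file in binary mode returns a str. The detector argument exists only so the function can be exercised without chardet installed; normal callers can leave it unset.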