Web Encodings

0.5.1 verified Tue May 12 auth: no python install: verified quickstart: stale maintenance

webencodings is a Python library that implements the WHATWG Encoding Standard. It provides character encoding aliases and rules for handling legacy web content, such as US-ASCII and ISO-8859-1 mapping to Windows-1252, and byte order mark (BOM) detection. The current version is 0.5.1, and its release cadence is considered stalled.

pip install webencodings
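
The label mapping and BOM detection described in the overview can be sketched as follows. This is a minimal illustration, assuming webencodings 0.5.1 is installed; the sample labels and byte strings are chosen for the example and are not from the library's documentation.

```python
from webencodings import lookup, decode

# Legacy labels resolve to the encoding the WHATWG standard assigns them.
for label in ("US-ASCII", "iso-8859-1", "latin1", "utf8"):
    print(label, "->", lookup(label).name)

# A byte order mark in the input takes precedence over the fallback encoding:
# these bytes start with the UTF-8 BOM, so the windows-1252 fallback is ignored.
text, encoding = decode(b"\xef\xbb\xbfcaf\xc3\xa9", "windows-1252")
print(repr(text), encoding.name)
```

Because `decode()` returns the encoding it actually used alongside the text, callers can tell whether the fallback or a BOM decided the result.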

Common errors

error ModuleNotFoundError: No module named 'webencodings'

cause The 'webencodings' package is not installed in the current Python environment.

fix

pip install webencodings

error TypeError: decode() missing 1 required positional argument: 'fallback_encoding'

cause The `webencodings.decode` function requires two arguments: the byte string to decode (first argument) and a fallback encoding, given as a label or an `Encoding` object (second argument). It returns a `(text, encoding)` tuple, where `encoding` is the encoding that was actually used.

fix

from webencodings import decode; decoded_text, encoding = decode(b'some bytes', 'utf-8')

error NameError: name 'lookup' is not defined

cause The `lookup` function was called without being imported from `webencodings` and without the `webencodings.` qualifier.

fix

from webencodings import lookup; encoding_obj = lookup('utf-8')

error AttributeError: 'NoneType' object has no attribute 'decode'

cause `webencodings.lookup()` returned `None` because the label was not recognized, and `.decode()` was then called on that `None`. Note that `Encoding` objects have no `decode` method either; decoding goes through the module-level `decode()` function.

fix

from webencodings import lookup, decode

encoding_obj = lookup('utf-8')
if encoding_obj is not None:
    decoded, used_encoding = decode(b'hello', encoding_obj)
else:
    print('Invalid encoding specified')
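
A short sketch of when `lookup()` returns `None`, assuming webencodings 0.5.1: labels that Python's own codec registry accepts are not necessarily WHATWG labels. The variable name `unknown_label_rejected` is illustrative only.

```python
from webencodings import lookup, decode

# "latin1" is a WHATWG label; "latin-1" and "u8" are Python-only aliases.
for label in ("latin1", "latin-1", "u8"):
    encoding = lookup(label)
    print(label, "->", encoding.name if encoding is not None else None)

# The module-level decode() does not return None for an unknown label;
# it raises LookupError instead.
try:
    decode(b"hello", "latin-1")
    unknown_label_rejected = False
except LookupError:
    unknown_label_rejected = True
print(unknown_label_rejected)
```

Checking the result of `lookup()` with `is not None` before using it avoids the `AttributeError` described above.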

Warnings

gotcha The default error handling for `webencodings.decode()` is 'replace', which substitutes the replacement character (U+FFFD) for invalid bytes. This differs from Python's standard library `codecs` module, which defaults to 'strict' and raises a `UnicodeDecodeError`. Encoding with `webencodings.encode()` defaults to 'strict'.

fix If strict error handling is desired, pass `errors='strict'` explicitly to the `decode` function: `decode(bytes_data, fallback_encoding, errors='strict')`.
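
The difference between the two error modes can be seen in a minimal sketch, assuming webencodings 0.5.1 and the module-level `decode(input, fallback_encoding, errors=...)` function. The sample bytes and the `strict_failed` flag are illustrative.

```python
from webencodings import decode

data = b"caf\xe9"  # a lone 0xE9 byte is not valid UTF-8

# Default errors='replace': the invalid byte becomes U+FFFD.
replaced, _ = decode(data, "utf-8")
print(repr(replaced))

# errors='strict': the same input raises UnicodeDecodeError.
try:
    decode(data, "utf-8", errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
print(strict_failed)
```

The lenient default suits tolerant parsing of web content; strict mode is the safer choice when silently corrupted text would be worse than a failure.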

gotcha `webencodings.lookup()` returns an `Encoding` object, which serves as a thin mapping layer: it carries the canonical `name` and a `codec_info` attribute holding a standard `codecs.CodecInfo`. The actual encoding and decoding operations are performed by Python's standard `codecs` machinery. The `Encoding` object itself does not expose `encode` or `decode` methods, so calling them raises an `AttributeError`. There is no public `get_encoding()` function and no `python_encoding` attribute.

fix Use the module-level `webencodings.encode()` and `webencodings.decode()` functions, or access the underlying `codecs.CodecInfo` object via the `codec_info` attribute. For example, instead of `encoding_obj.encode(text)`, use `encoding_obj.codec_info.encode(text)[0]`; the codec returns a `(result, length)` tuple.

deprecated The library's development status on PyPI is '4 - Beta' and its release cadence is 'Stalled', with the last release in April 2017. Although the library is widely used, this indicates a lack of active development, and it may not receive updates for new encoding standards or Python versions.

fix Monitor for actively maintained alternatives if long-term support and up-to-date standards compliance are critical for new projects. For existing projects, be aware of potential compatibility issues with newer Python versions; the package declares support for Python 2.6+ and 3.3+.

gotcha The `webencodings.Encoding` object has no `encode` method (and no `decode` method or `codec` attribute). Attempting to call `encode()` on the `Encoding` object itself will result in an `AttributeError`. Encoding operations should go through the module-level `webencodings.encode()` function or the underlying codec, which is accessible via the `codec_info` attribute of the `Encoding` instance.

fix To encode text, call `webencodings.encode(text_to_encode, encoding_obj)`; the encoding defaults to UTF-8 and error handling to 'strict'. For example, if you have `utf8_encoding = webencodings.lookup('utf-8')`, use `webencodings.encode(text_to_encode, utf8_encoding)`, or `utf8_encoding.codec_info.encode(text_to_encode)[0]` to call the codec directly.
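
Both encoding routes can be compared in a minimal sketch, assuming webencodings 0.5.1. The sample string and the variable names are illustrative.

```python
from webencodings import lookup, encode

utf8 = lookup("utf-8")
text = "na\u00efve"  # "naive" with a diaeresis on the i

# Module-level helper: returns the bytes directly (errors='strict' by default).
via_function = encode(text, utf8)

# Underlying codec: returns a (bytes, length_consumed) tuple.
via_codec, consumed = utf8.codec_info.encode(text)

print(via_function, via_codec == via_function, consumed)
```

The module-level function is usually the more convenient of the two, since it accepts either a label or an `Encoding` object and unwraps the codec's tuple result.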

Install compatibility verified last tested: 2026-05-12

| python | os / libc | wheel | install | import | disk |
| --- | --- | --- | --- | --- | --- |
| 3.10 | alpine (musl) | wheel | - | 0.00s | 17.9M |
| 3.10 | alpine (musl) | - | - | 0.00s | 17.9M |
| 3.10 | slim (glibc) | wheel | 1.4s | 0.00s | 18M |
| 3.10 | slim (glibc) | - | - | 0.00s | 18M |
| 3.11 | alpine (musl) | wheel | - | 0.01s | 19.7M |
| 3.11 | alpine (musl) | - | - | 0.01s | 19.7M |
| 3.11 | slim (glibc) | wheel | 1.6s | 0.00s | 20M |
| 3.11 | slim (glibc) | - | - | 0.00s | 20M |
| 3.12 | alpine (musl) | wheel | - | 0.00s | 11.6M |
| 3.12 | alpine (musl) | - | - | 0.01s | 11.6M |
| 3.12 | slim (glibc) | wheel | 1.5s | 0.00s | 12M |
| 3.12 | slim (glibc) | - | - | 0.00s | 12M |
| 3.13 | alpine (musl) | wheel | - | 0.01s | 11.3M |
| 3.13 | alpine (musl) | - | - | 0.01s | 11.2M |
| 3.13 | slim (glibc) | wheel | 1.5s | 0.01s | 12M |
| 3.13 | slim (glibc) | - | - | 0.01s | 12M |
| 3.9 | alpine (musl) | wheel | - | 0.00s | 17.4M |
| 3.9 | alpine (musl) | - | - | 0.00s | 17.4M |
| 3.9 | slim (glibc) | wheel | 1.8s | 0.02s | 18M |
| 3.9 | slim (glibc) | - | - | 0.00s | 18M |

Imports

lookup
```
from webencodings import lookup
```
Used to find an encoding by its label according to the WHATWG standard.
decode
```
from webencodings import decode
```
Module-level function. Takes the byte string and a fallback encoding (a label or an `Encoding` object) and returns a `(text, encoding)` tuple; a BOM in the input overrides the fallback. There is no `webencodings.sync` submodule.
encode
```
from webencodings import encode
```
Module-level function. Takes the text and an encoding (a label or an `Encoding` object, UTF-8 by default) and returns the encoded bytes.

Quickstart stale last tested: 2026-04-24

This quickstart demonstrates how to use `webencodings.lookup` to retrieve an `Encoding` object, and then pass it to the module-level `encode` and `decode` functions (the `Encoding` object has no `encode` or `decode` methods of its own). It also highlights the default 'replace' error handling for decoding and how to explicitly use 'strict' handling.

from webencodings import lookup, encode, decode

# Look up an encoding by its label; lookup() returns None for unknown labels
utf8_encoding = lookup('utf-8')

if utf8_encoding is not None:
    # Encode a string (encode() uses errors='strict' by default)
    text_to_encode = "Hello, world!"
    encoded_bytes = encode(text_to_encode, utf8_encoding)
    print(f"Encoded bytes: {encoded_bytes}")

    # Decode bytes (with default error handling 'replace');
    # decode() returns a (text, encoding) tuple
    bytes_to_decode = b'Hello, world!\xed'
    decoded_text, used_encoding = decode(bytes_to_decode, utf8_encoding)
    print(f"Decoded text (with replace): {decoded_text}")

    # Decode bytes with strict error handling
    try:
        decoded_text_strict, _ = decode(bytes_to_decode, utf8_encoding, errors='strict')
        print(f"Decoded text (strict): {decoded_text_strict}")
    except UnicodeDecodeError as e:
        print(f"Strict decoding error: {e}")
else:
    print("UTF-8 encoding not found.")