Bleach

6.3.0 verified Tue May 12 auth: no python install: verified quickstart: verified deprecated

Bleach is an allowlist-based HTML sanitizing library for Python (current version 6.3.0) that escapes or strips markup and attributes according to a configurable allowlist. It can also safely linkify text, including setting `rel` attributes on the generated links. It is intended for sanitizing text from untrusted sources and is built on html5lib, which makes it robust against malformed HTML fragments. The project was deprecated in January 2023 because its upstream dependency `html5lib` is no longer actively maintained; it is now in minimum-maintenance mode and is discouraged for new projects.

pip install bleach
error ModuleNotFoundError: No module named 'bleach'
cause The 'bleach' module is not installed in the Python environment.
fix
Install the 'bleach' module using pip: 'pip install bleach'.
error ModuleNotFoundError: No module named 'bleach_whitelist'
cause The 'bleach_whitelist' module is not installed in the Python environment.
fix
Install the 'bleach_whitelist' module using pip: 'pip install bleach_whitelist'.
error AttributeError: module 'bleach_allowlist.bleach_allowlist' has no attribute 'all_styles'
cause The 'bleach_allowlist' module is either not installed correctly or is outdated.
fix
Reinstall the 'bleach_allowlist' module using pip: 'pip uninstall bleach_allowlist' followed by 'pip install bleach_allowlist'.
error AttributeError: module 'bleach' has no attribute 'clean'
cause `bleach.clean` is the standard sanitization function, so this error usually means Python imported something other than the real package: for example, a local file or directory named `bleach` shadowing the installed library, or a broken or incompatible install.
fix
Check that no file or directory in your project is named `bleach`, reinstall the package if needed, then use `import bleach` and call `bleach.clean(text)`.
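One way to investigate is to check which module Python actually imported. The sketch below assumes the problem is a shadowing file or a stale install, which is only one possible cause and is not confirmed by the error entry itself.

```python
import bleach

# Path of the module that was actually imported. If this points into your
# own project rather than site-packages, a local file is shadowing Bleach.
print(bleach.__file__)

# Installed version string, useful when comparing against the changelog.
print(bleach.__version__)

# The real package exposes clean() at the top level.
has_clean = hasattr(bleach, "clean")
print(has_clean)
```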
error TypeError: argument cannot be of 'NoneType' type, must be of text type
cause The `bleach.clean()` or `bleach.linkify()` function was called with `None` as the input text argument, but it expects a string (text type).
fix
Ensure that the input text passed to bleach.clean() or bleach.linkify() is always a string and not None. You might need to add a check for None or provide a default empty string.
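A minimal sketch of such a guard follows. The helper name `clean_optional` is hypothetical and not part of Bleach; it simply maps `None` to an empty string before delegating to `bleach.clean()`.

```python
import bleach


def clean_optional(text, **kwargs):
    """Sanitize text, treating None as an empty string (hypothetical helper)."""
    if text is None:
        return ""
    # Any extra keyword arguments (tags, attributes, ...) pass straight through.
    return bleach.clean(text, **kwargs)


empty_result = clean_optional(None)
bold_result = clean_optional("<b>hello</b>")
print(repr(empty_result))
print(bold_result)
```

Whether `None` should become an empty string or be rejected earlier depends on the application; the point is only that Bleach itself never sees a non-string value.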
deprecated The Bleach project was officially deprecated on January 23, 2023, due to its reliance on the unmaintained `html5lib` library. It is now in a minimum-maintenance mode, and new projects are explicitly discouraged from using it.
fix Consider alternative HTML sanitization libraries, or fork/maintain `bleach` and `html5lib` at your own risk for existing projects. Do not use for new development.
breaking In 6.0.0, the `tags` and `protocols` arguments of `bleach.clean()`, `bleach.sanitizer.Cleaner`, and `bleach.html5lib_shim.BleachHTMLParser` changed from lists to sets. Likewise, the `skip_tags` and `recognized_tags` arguments of `bleach.linkify()` and `bleach.linkifier.Linker` changed from lists to sets.
fix Update argument values from lists to Python `set` objects (e.g., `['b', 'i']` becomes `{'b', 'i'}`).
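As a small illustration of the migration, an existing list can be wrapped in `set()` at the call site. The variable name `legacy_tags` is made up for this sketch.

```python
import bleach

# A pre-6.0 style configuration that stored allowed tags in a list.
legacy_tags = ["b", "i"]

# Convert to a set when passing it to Bleach 6.x.
cleaned = bleach.clean(
    "<b>bold</b> <u>underline</u>",
    tags=set(legacy_tags),
)

# <b> is kept; <u> is not in the allowlist, so it is escaped rather than removed.
print(cleaned)
```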
breaking In 5.0.0, CSS sanitization within `style` attributes was completely rewritten. If you were sanitizing CSS, you need to update your code, and the functionality now requires installing `bleach` with the `[css]` extra: `pip install 'bleach[css]'`.
fix Install the `css` extra (`pip install 'bleach[css]'`) and review the updated documentation for CSS sanitization in `style` attributes.
breaking Attribute callables (functions passed to `attributes` argument) for `clean()` and `linkify()` changed their signature. They now expect three arguments: `tag`, `attribute_name`, and `attribute_value`, rather than just `attribute_name` and `attribute_value`.
fix Update custom attribute callable functions to accept the `tag` argument as the first parameter.
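The three-argument form might look like the following sketch. The policy itself (`allow_https_href`) is hypothetical: it keeps only `href` attributes on `<a>` tags whose value starts with `https://`.

```python
import bleach


def allow_https_href(tag, name, value):
    """Hypothetical policy: keep only https href attributes on <a> tags."""
    return tag == "a" and name == "href" and value.startswith("https://")


cleaned = bleach.clean(
    '<a href="https://example.com" onclick="steal()">link</a>',
    tags={"a"},
    attributes=allow_https_href,
)

# The callable returns False for onclick, so that attribute is removed.
print(cleaned)
```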
gotcha The output of `bleach.clean()` is intended for use specifically in an HTML *content* context (e.g., `<div>{{ cleaned_text }}</div>`). It is NOT safe for use in HTML attributes, CSS, JavaScript, JSON, or other contexts without further appropriate escaping (e.g., using a template engine's `escape` function).
fix Use `bleach.clean()` output directly only as HTML element content. For any other destination, apply escaping appropriate to that context as well, for example `django.utils.html.escape` or `markupsafe.escape` (the function behind Jinja2's `escape` filter) for an HTML attribute value.
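To make the distinction concrete, the sketch below uses the standard library's `html.escape` as a stand-in for a template engine's escape function. That substitution is an assumption made to keep the example dependency-free.

```python
import html

import bleach

user_input = 'Tom "the cat" <b>Jones</b>'

# Safe for HTML element content: allowed tags survive, everything else is escaped.
body_html = bleach.clean(user_input)

# For an attribute value, escape the cleaned output again so that quotes and
# angle brackets cannot break out of the attribute.
attr_value = html.escape(body_html, quote=True)

snippet = f'<div title="{attr_value}">{body_html}</div>'
print(snippet)
```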
breaking Bleach dropped support for older Python versions: 3.6 (v6.0.0), 3.7 (v6.1.0), 3.8 (v6.2.0), and 3.9 (v6.3.0). The current version (6.3.0) requires Python >=3.10.
fix Ensure your project runs on Python 3.10 or newer.
pip install 'bleach[css]'
python os / libc variant status wheel install import disk
3.10 alpine (musl) css - - 0.26s 19.6M
3.10 alpine (musl) bleach - - 0.25s 19.3M
3.10 slim (glibc) css - - 0.20s 20M
3.10 slim (glibc) bleach - - 0.19s 20M
3.11 alpine (musl) css - - 0.41s 21.7M
3.11 alpine (musl) bleach - - 0.42s 21.4M
3.11 slim (glibc) css - - 0.44s 22M
3.11 slim (glibc) bleach - - 0.32s 22M
3.12 alpine (musl) css - - 0.33s 13.5M
3.12 alpine (musl) bleach - - 0.31s 13.2M
3.12 slim (glibc) css - - 0.30s 14M
3.12 slim (glibc) bleach - - 0.33s 14M
3.13 alpine (musl) css - - 0.30s 13.1M
3.13 alpine (musl) bleach - - 0.30s 12.8M
3.13 slim (glibc) css - - 0.28s 14M
3.13 slim (glibc) bleach - - 0.29s 13M
3.9 alpine (musl) css - - 0.23s 19.0M
3.9 alpine (musl) bleach - - 0.23s 18.7M
3.9 slim (glibc) css - - 0.20s 19M
3.9 slim (glibc) bleach - - 0.20s 19M

This example demonstrates basic HTML sanitization using `bleach.clean()` and URL linkification with `bleach.linkify()`. It also shows how to use a `Cleaner` instance for more advanced or repeated sanitization tasks with custom allowed tags and attributes.

import bleach

# Sanitize HTML
html_input = 'An <script>alert("evil")</script> example with <b>bold</b> text.'
cleaned_html = bleach.clean(
    html_input,
    tags={'b', 'i', 'strong', 'em', 'a', 'p', 'br'},
    attributes={'a': ['href', 'title']}
)
print(f"Cleaned HTML: {cleaned_html}")

# Linkify text (email addresses are only linked when parse_email=True)
text_with_urls = 'Check out example.com or write to user@example.com'
linkified_text = bleach.linkify(text_with_urls, parse_email=True)
print(f"Linkified text: {linkified_text}")

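# Using a Cleaner instance for performance/configurability
from bleach.sanitizer import Cleaner
my_cleaner = Cleaner(
    tags={'p', 'span'},
    attributes={'span': ['style']},
    # With no CSS sanitizer, Bleach warns and blanks the value of any allowed
    # style attribute. Pass bleach.css_sanitizer.CSSSanitizer() (needs
    # 'bleach[css]') to keep allowed CSS properties.
    css_sanitizer=None
)
# style is only allowed on <span>, so it is dropped from <p>; <img> is not an
# allowed tag, so it is escaped (strip=False is the default).
complex_html = '<p style="color: red;">Safe paragraph</p><img src="x.jpg">'
cleaned_complex = my_cleaner.clean(complex_html)
print(f"Cleaned with Cleaner: {cleaned_complex}")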
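The description above mentions setting `rel` attributes while linkifying. A brief sketch of how that can be controlled with the callbacks shipped in `bleach.callbacks` follows; the input text is made up.

```python
import bleach
from bleach.callbacks import nofollow, target_blank

# Each callback receives the link's attributes and may modify them:
# nofollow adds rel="nofollow", target_blank adds target="_blank".
linked = bleach.linkify(
    "Docs live at example.com",
    callbacks=[nofollow, target_blank],
)
print(linked)
```

Passing an empty list for `callbacks` produces links without the default `rel="nofollow"`.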