Bleach Allowlist
Bleach Allowlist provides curated lists of HTML tags, attributes, and CSS properties for sanitizing user-provided HTML with the `bleach` library. It offers ready-to-use allowlists for common scenarios such as Markdown rendering and printing, plus lists of CSS property names. The current version is 1.0.3, released on August 13, 2020. The library has had a single stable release since its inception and serves primarily as a data provider for `bleach`.
Common errors
-
ModuleNotFoundError: No module named 'bleach_whitelist'
cause The code imports the old module name `bleach_whitelist`, which belonged to the deprecated `bleach-whitelist` package.
fix The package was renamed. Install `bleach-allowlist` and update your imports from `bleach_whitelist` to `bleach_allowlist`.
-
TypeError: clean() got an unexpected keyword argument 'styles'
cause `bleach.clean()` was called with the `styles` keyword argument on `bleach` 5.0.0 or later. The API changed to use a `css_sanitizer` object instead.
fix Update your `bleach.clean` call. Instead of passing `styles=...`, import `CSSSanitizer` from `bleach.css_sanitizer` and pass an instance: `css_sanitizer=CSSSanitizer(allowed_css_properties=your_style_list)`. The CSS sanitizer needs the optional `tinycss2` dependency, which is installed with `pip install "bleach[css]"`.
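The `styles` to `css_sanitizer` change described above can be handled in code that has to run against both old and new `bleach` releases. The following is a minimal sketch, not part of either library: `style_kwargs` is a hypothetical helper name, and it assumes the version string starts with a numeric major version (as `bleach.__version__` does).

```python
def style_kwargs(allowed_properties, bleach_version):
    """Return the CSS-related keyword argument that bleach.clean() expects.

    Hypothetical helper for illustration; not provided by bleach or
    bleach-allowlist.
    """
    major = int(bleach_version.split(".")[0])
    if major >= 5:
        # bleach >= 5.0 removed `styles`; a CSSSanitizer instance is required.
        # Imported lazily because it needs the optional tinycss2 dependency
        # (installed via the `bleach[css]` extra).
        from bleach.css_sanitizer import CSSSanitizer

        return {
            "css_sanitizer": CSSSanitizer(allowed_css_properties=allowed_properties)
        }
    # bleach < 5.0 accepts a plain list of allowed CSS property names.
    return {"styles": list(allowed_properties)}
```

A call site could then look like `bleach.clean(html, tags=..., attributes=..., **style_kwargs(all_styles, bleach.__version__))`, which keeps the version check in one place.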
Warnings
- breaking The upstream `bleach` library, which `bleach-allowlist` depends on, has been deprecated since January 2023. While `bleach-allowlist` itself provides static lists and is not deprecated, its utility is tied directly to `bleach`.
- deprecated The project was originally released under the name `bleach-whitelist` and subsequently renamed to `bleach-allowlist`. The `bleach-whitelist` package is deprecated and will not receive updates.
- breaking `bleach` 5.0.0 introduced a breaking change to CSS sanitization. The `styles` argument to `bleach.clean` was removed and replaced with a `css_sanitizer` argument that expects an instance of `bleach.css_sanitizer.CSSSanitizer`.
Install
-
pip install bleach-allowlist
Imports
- markdown_tags
from bleach_allowlist import markdown_tags
- markdown_attrs
from bleach_allowlist import markdown_attrs
- print_tags
from bleach_allowlist import print_tags
- print_attrs
from bleach_allowlist import print_attrs
- all_styles
from bleach_allowlist import all_styles
- standard_styles
from bleach_allowlist import standard_styles
Quickstart
import bleach
# CSSSanitizer exists in bleach >= 5.0 and needs the CSS extra: pip install "bleach[css]"
from bleach.css_sanitizer import CSSSanitizer
from bleach_allowlist import print_tags, print_attrs, all_styles

raw_html = '<h1>Hello <script>alert("XSS")</script>World!</h1><p style="color: red;">This is a paragraph.</p><a href="javascript:alert(1)">Click me</a>'

# bleach >= 5.0 no longer accepts `styles=`; allowed CSS properties are passed
# through a CSSSanitizer instance. On bleach < 5.0, pass `styles=all_styles` instead.
# Note: bleach itself is deprecated; consider alternatives for new projects
# and always refer to bleach's current documentation.
css_sanitizer = CSSSanitizer(allowed_css_properties=all_styles)

sanitized_html = bleach.clean(
    raw_html,
    tags=print_tags,
    attributes=print_attrs,
    css_sanitizer=css_sanitizer,
    strip=True,  # remove disallowed tags instead of escaping them
)
print(sanitized_html)