Brotli Compression Library
Python bindings for the Brotli compression library. It offers high-performance lossless compression, often achieving better ratios than gzip, particularly for text-based data. The library's releases are typically synchronized with updates to the underlying C Brotli reference implementation.
Warnings
- breaking Older versions (<= 1.1.0) are vulnerable to Denial of Service (DoS) attacks via 'decompression bombs' (e.g., highly compressed zero-filled data that expands to an excessive size), which can exhaust memory. Version 1.2.0 adds `Decompressor.can_accept_more_data()` and an optional `output_buffer_limit` argument to `Decompressor.process()` so that callers can bound how much output a single call produces.
- gotcha Versions of brotli prior to v1.0.9 were vulnerable to an integer overflow in the decoder when processing input chunks larger than 2GiB (CVE-2020-8927).
- gotcha Attempting to decompress data that was not compressed with Brotli will raise a `brotli.error` exception. Ensure the input data is valid Brotli compressed data.
- gotcha When using `brotli.Compressor` for streaming compression, it's crucial to call the `.finish()` method at the end of the data stream. Failing to do so will result in an incomplete or corrupted compressed output, as buffered data may not be flushed.
- gotcha When interacting with files, Brotli operates on raw bytes. Files must be opened in binary mode (`'rb'` for reading, `'wb'` for writing) to avoid `TypeError` or unexpected corruption.
- gotcha HTTP client libraries like `requests` and `urllib3` automatically detect and decompress Brotli (and other) encodings if the `brotli` library is installed. If you attempt to manually decompress content that has already been automatically decompressed, it will result in `brotli.error`.
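The binary-mode and invalid-input warnings above can be sketched together. The helper names `compress_file` and `try_decompress`, and the temporary file names, are illustrative assumptions and not part of the library; only `brotli.compress`, `brotli.decompress`, and `brotli.error` come from the package.

```python
import os
import tempfile

import brotli


def compress_file(src_path, dst_path, quality=5):
    """Compress src_path into dst_path; both files are opened in binary mode."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.write(brotli.compress(src.read(), quality=quality))


def try_decompress(payload):
    """Return the decompressed bytes, or None if payload is not valid Brotli."""
    try:
        return brotli.decompress(payload)
    except brotli.error:
        return None


with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "sample.txt")
    packed_path = os.path.join(tmp, "sample.txt.br")
    text = b"binary mode keeps these bytes intact\n" * 20

    with open(plain_path, "wb") as f:
        f.write(text)
    compress_file(plain_path, packed_path)

    with open(packed_path, "rb") as f:
        restored = try_decompress(f.read())

# Ordinary text is not a Brotli stream, so the helper reports failure.
not_brotli = try_decompress(b"plain text that was never compressed")
print(restored == text, not_brotli is None)
```

Returning `None` is only one way to signal failure; in a larger program it may be clearer to let `brotli.error` propagate to the caller.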
Install
-
pip install brotli
Imports
- brotli
import brotli
- compress
brotli.compress(data)
- decompress
brotli.decompress(compressed_data)
- Compressor
compressor = brotli.Compressor()
- Decompressor
decompressor = brotli.Decompressor()
- error
try: ... except brotli.error: ...
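As a rough illustration of the `compress` entry above, the following sketch passes the `quality` keyword (0 to 11) at three settings and prints the resulting sizes. The sample text is made up, and the actual sizes depend on the input and the installed version, so no expected output is shown.

```python
import brotli

sample = b"The quick brown fox jumps over the lazy dog. " * 200

sizes = {}
for quality in (0, 5, 11):
    packed = brotli.compress(sample, quality=quality)
    # Compression is lossless at every quality level.
    assert brotli.decompress(packed) == sample
    sizes[quality] = len(packed)
    print(f"quality={quality:2d}: {len(sample)} -> {len(packed)} bytes")
```

Lower quality values trade compression ratio for speed, so the right setting depends on whether the data is compressed once and served many times or compressed on every request.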
Quickstart
import brotli
# Example data (must be bytes)
original_data = b"This is some sample text data that we want to compress. It's repetitive and will benefit from Brotli's algorithm."
# Compress data with quality=5 (the default is 11, the highest compression)
# quality can range from 0 (fastest) to 11 (highest compression ratio)
compressed_data = brotli.compress(original_data, quality=5)
print(f"Original size: {len(original_data)} bytes")
print(f"Compressed size: {len(compressed_data)} bytes")
# Decompress data
decompressed_data = brotli.decompress(compressed_data)
print(f"Decompressed size: {len(decompressed_data)} bytes")
print(f"Data matches original: {original_data == decompressed_data}")
# Example of streaming compression
compressor = brotli.Compressor(quality=4)
streaming_compressed = b''
for i in range(3):
    chunk = b'chunk ' + str(i).encode() + b' of data\n'
    streaming_compressed += compressor.process(chunk)
streaming_compressed += compressor.finish()
print(f"\nStreaming compressed size: {len(streaming_compressed)} bytes")
# Example of streaming decompression (output_buffer_limit requires brotli >= 1.2.0)
decompressor = brotli.Decompressor()
streaming_decompressed = b''
max_output_len = 1024  # Upper bound on the bytes returned by one process() call
try:
    for chunk in [streaming_compressed]:  # In real usage, this would be chunks of compressed data
        streaming_decompressed += decompressor.process(chunk, output_buffer_limit=max_output_len)
        # If the limit was hit, drain pending output with empty input before feeding more data
        while not decompressor.can_accept_more_data():
            streaming_decompressed += decompressor.process(b'', output_buffer_limit=max_output_len)
    # Decompressor has no finish() method; is_finished() reports whether the stream is complete
    if not decompressor.is_finished():
        raise brotli.error("Incomplete Brotli stream")
    print(f"Streaming decompressed size: {len(streaming_decompressed)} bytes")
except brotli.error as e:
    print(f"Decompression error: {e}")
    print("The compressed data is likely corrupted or truncated.")