LZ4 Bindings for Python
The lz4 library provides Python bindings for Yann Collet's high-performance LZ4 compression algorithm. It supports both the frame and block formats; the frame format is recommended for most applications because it interoperates with other LZ4 implementations. The library is actively maintained, at version 4.4.5 at the time of writing, and offers a Pythonic API that will feel familiar to users of standard library compression modules like `zlib` or `gzip`. The API is similar, but the compressed formats are not interchangeable.
Warnings
- gotcha Input data for compression functions (`lz4.frame.compress`, `lz4.block.compress`) must be `bytes` or another bytes-like object. Passing a Python `str` directly raises a `TypeError`. Always encode strings first (e.g., `my_string.encode('utf-8')`).
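- gotcha By default, `lz4.block.compress` stores the uncompressed size at the start of the block, and `lz4.block.decompress` reads it from there. For blocks without that size header (for example, compressed with `store_size=False`), you must pass `uncompressed_size`; if the value is too small, `lz4.block.LZ4BlockError` is raised. That error cannot be distinguished from the one raised for corrupt input, so if you retry with a larger size, enforce an absolute upper bound (e.g., a `max_size` limit in your own code) on memory allocated during decompression.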
- gotcha The `lz4.frame` module is recommended for most applications because it defines a standard container format that interoperates with other LZ4 implementations and language bindings. Using `lz4.block` to compress bare blocks without a container can cause compatibility problems when exchanging data with other systems.
- deprecated The `lz4.stream` sub-package is considered experimental, unmaintained, and not built into distributed wheels. It requires manual compilation from source with specific environment variables. Avoid using it in production environments.
- gotcha The `compression_level` and `block_size` parameters control the trade-off between compression ratio and speed. LZ4 is known for speed, but higher `compression_level` values (e.g., 5 or 9) produce smaller output at the cost of longer compression times. Defaults may change between releases or differ from other LZ4 implementations, so set both parameters explicitly if your application depends on a particular ratio or throughput.
Install
pip install lz4
Imports
- lz4.frame
import lz4.frame
- lz4.block
import lz4.block
Quickstart
import lz4.frame
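# About 1 MB of repetitive bytes; random data such as os.urandom() is
# incompressible and would come out slightly larger after compression
original_data = b"LZ4 favors speed over ratio but still shrinks repetitive data. " * 16384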
# Compress data using the LZ4 frame format
compressed_data = lz4.frame.compress(original_data)
# Decompress data
decompressed_data = lz4.frame.decompress(compressed_data)
# Verify integrity
assert original_data == decompressed_data
print(f"Original size: {len(original_data)} bytes")
print(f"Compressed size: {len(compressed_data)} bytes")
print("Data compressed and decompressed successfully.")