LZ4 Bindings for Python
The lz4 library provides Python bindings for the high-performance LZ4 compression algorithm by Yann Collet. It supports both the frame and block formats, with the frame format being recommended for most applications due to its interoperability. The library is actively maintained with frequent releases, currently at version 4.4.5, and offers a Pythonic API that can serve as a drop-in alternative to standard library compression modules like `zlib` or `gzip`.
Common errors
- ModuleNotFoundError: No module named 'lz4.frame'
  cause The `lz4` package is not installed, or the active Python environment is set up in a way that prevents the `lz4.frame` submodule (or `lz4.block`, or the top-level `lz4` module) from being found.
  fix Install the library into the environment you are actually running: `pip install lz4`
- lz4.block.LZ4BlockError: Decompression failed: corrupt input or insufficient space in destination buffer. Error code: XXX
  cause The input data is corrupted, the provided `uncompressed_size` is incorrect or too small, or the compression and decompression methods do not match (e.g., block vs. frame, or data compressed by a different LZ4 implementation).
  fix Verify the integrity of the compressed data. If using `lz4.block.decompress`, make sure the `uncompressed_size` argument is accurate. If the data was compressed using the frame format, use `lz4.frame.decompress` instead, which does not require the caller to supply the uncompressed size. When the size is unknown, start from an estimate `usize`, call `lz4.block.decompress(compressed, uncompressed_size=usize)` in a loop, and double `usize` each time `lz4.block.LZ4BlockError` is raised.
- TypeError: a bytes-like object is required, not 'str'
  cause The LZ4 compression and decompression functions operate on byte sequences, not Python strings. This error occurs when a `str` is passed as input without being encoded first.
  fix Encode the string to bytes before compressing, and decode the resulting bytes back to a string after decompressing: `original_string.encode('utf-8')` before compression, and `decompressed_bytes.decode('utf-8')` after decompression.
- ModuleNotFoundError: No module named 'deprecation'
  cause Older versions of the `lz4` library might depend on the `deprecation` package, which is not installed in the current Python environment.
  fix Install the missing package: `pip install deprecation`. Alternatively, upgrade `lz4` to a newer version, which might no longer have this dependency: `pip install --upgrade lz4`.
Warnings
- gotcha Input data for compression functions (`lz4.frame.compress`, `lz4.block.compress`) must be `bytes`. Passing a standard Python string directly will result in a `TypeError`. Always encode strings (e.g., `my_string.encode('utf-8')`) before compression.
- gotcha When using `lz4.block.decompress` on data that does not carry its own size header, a `lz4.block.LZ4BlockError` may be raised if the `uncompressed_size` parameter is not provided or is too small. In this case it is often impossible to distinguish between insufficient buffer size and corrupt input data. If you retry with larger sizes, enforce an absolute upper bound in your own code (for example a `max_size` limit) so that corrupt input cannot trigger unbounded memory allocation.
- gotcha The `lz4.frame` module is generally recommended for most applications as it defines a standard container format that ensures interoperability with other LZ4 implementations and language bindings. Using `lz4.block` for direct block compression without a container can lead to compatibility issues when exchanging data with other systems.
- deprecated The `lz4.stream` sub-package is considered experimental, unmaintained, and not built into distributed wheels. It requires manual compilation from source with specific environment variables. Avoid using it in production environments.
- gotcha The `compression_level` and `block_size` parameters significantly impact the trade-off between compression ratio and speed. While `lz4` is known for speed, higher `compression_level` values (e.g., 5 or 9) will yield smaller outputs but take longer. Default values might change or differ from other LZ4 implementations, potentially causing unexpected performance shifts.
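- breaking Installing the `lz4` library from source (for example, when no prebuilt wheel is available for your platform or Python version) requires a C compiler (e.g., `gcc`) to build its C extensions. Build failures of this kind typically occur in minimal environments where development tools are not pre-installed.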
Install
- pip install lz4
Imports
- lz4.frame
import lz4.frame
- lz4.block
import lz4.block
Quickstart
import lz4.frame
import os
original_data = os.urandom(1024 * 1024)  # 1 MB of random bytes; random data is incompressible, so expect no size reduction
# Compress data using the LZ4 frame format
compressed_data = lz4.frame.compress(original_data)
# Decompress data
decompressed_data = lz4.frame.decompress(compressed_data)
# Verify integrity
assert original_data == decompressed_data
print(f"Original size: {len(original_data)} bytes")
print(f"Compressed size: {len(compressed_data)} bytes")
print("Data compressed and decompressed successfully.")