cramjam

2.11.0 verified Tue May 12 auth: no python install: verified

cramjam provides extremely thin and easy-to-install Python bindings to various de/compression algorithms implemented in Rust. It offers access to popular algorithms like Snappy, Brotli, Bzip2, LZ4, Gzip, Zlib, Deflate, ZSTD, and XZ/LZMA. The library, currently at version 2.11.0, is actively maintained with a regular release cadence.

pip install --upgrade cramjam
error ModuleNotFoundError: No module named 'cramjam'
cause The `cramjam` package is not installed in your current Python environment, or the environment where the code is being run does not have access to it.
fix
Install the package using pip: pip install cramjam
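A common cause of this error is installing into one interpreter and running another. The sketch below uses only the standard library to report which interpreter is active and whether it can find the module; the helper name `is_installed` is made up for illustration.

```python
import importlib.util
import sys


def is_installed(module_name: str) -> bool:
    """Return True if the current interpreter can locate the module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # sys.executable identifies the interpreter, and therefore the environment, in use.
    print(f"Interpreter: {sys.executable}")
    print(f"cramjam importable: {is_installed('cramjam')}")
```

If the check reports False, running `python -m pip install cramjam` with that same interpreter installs the package into the environment that is actually executing the code.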
error error installing cramjam package
cause Installation issues can arise due to incompatible Python versions, especially in constrained environments like Docker, or if there are conflicts with other dependencies (e.g., `fastparquet` or `python-snappy`) that rely on `cramjam` and have specific environment requirements.
fix
Ensure you are using a Python version supported by the cramjam wheels (typically Python 3.8+). If in Docker, try changing the Python version of the base image in your Dockerfile. cramjam itself aims to be dependency-free, so system packages such as libsnappy-dev are unlikely to be what is missing; if no pre-built wheel matches your platform, pip falls back to building from source, which requires a Rust toolchain.
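When an install fails, it helps to know exactly which interpreter, architecture, and libc pip is trying to match a wheel against. The following is a minimal sketch using only the standard library; the function name `describe_environment` is hypothetical, and the 3.8 minimum is taken from the note above rather than checked against the package metadata.

```python
import platform
import sys


def describe_environment() -> dict:
    """Collect the details that decide which wheel (if any) pip can use."""
    # libc_ver() may return empty strings on musl-based or non-Linux systems.
    libc_name, libc_version = platform.libc_ver()
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "system": platform.system(),
        "machine": platform.machine(),
        "libc": f"{libc_name} {libc_version}".strip() or "unknown",
        # Assumption: 3.8 is the minimum supported version, as stated in the text.
        "meets_minimum": sys.version_info >= (3, 8),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```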
error Fatal Python error: Segmentation fault
cause This severe error can occur due to fundamental incompatibilities, often when `cramjam` is used with very new or pre-release Python versions, leading to memory access violations within the underlying Rust bindings.
fix
Downgrade to a stable and widely supported Python version (e.g., Python 3.9, 3.10, 3.11) that has pre-built cramjam wheels, or check the cramjam GitHub issues for compatibility with the specific Python version you are using.
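Because a segmentation fault kills the whole process, one way to test an interpreter safely is to attempt the import in a child process and inspect its exit code. This is a sketch under that assumption; `import_in_subprocess` is a hypothetical helper, and `-X faulthandler` is the standard CPython option that prints a traceback on fatal signals.

```python
import subprocess
import sys


def import_in_subprocess(module_name: str):
    """Import a module in a child interpreter and return (exit_code, stderr)."""
    result = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stderr


if __name__ == "__main__":
    code, stderr = import_in_subprocess("cramjam")
    # On POSIX, a negative exit code means the child was killed by a signal
    # (SIGSEGV is signal 11, reported as -11).
    print(f"exit code: {code}")
    if code != 0:
        print(stderr)
    # Pre-release interpreters report a release level other than "final".
    print(f"release level: {sys.version_info.releaselevel}")
```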
error cramjam.DecompressionError: Invalid input or corrupted data
cause This error indicates that the data provided to a `cramjam` decompression function is not in the expected compressed format, is corrupted, or was compressed with a different algorithm than the one being used for decompression.
fix
Verify that the input data is indeed compressed with the cramjam algorithm you are attempting to use (e.g., cramjam.snappy.decompress for Snappy-compressed data) and that the data itself is not truncated or corrupted. Ensure the correct bytes-like object is passed.
gotcha When integrating `cramjam`'s LZ4 implementation or migrating from other LZ4 libraries, be aware that the default compression level in `cramjam`'s LZ4 might differ. For example, a change from level 0 to 9 in a dependent library using `cramjam`'s LZ4 was observed to more than double latency (over 100% increase) because the level was not set explicitly. Always verify the compression level if performance is critical.
fix Explicitly set the compression level when calling `cramjam.lz4.compress` (the keyword is `level` in current releases; check the signature of your installed version), or ensure any integrating library configures it appropriately. The default behavior may vary and impact performance.
gotcha For significant performance improvements (1.5-3x speedup), especially when decompressing or compressing to a standard `bytes` or `bytearray` object, provide the `output_len` argument if the exact output length is known beforehand. This allows for single buffer allocation. This optimization is less relevant when using `cramjam.Buffer` or `cramjam.File` objects.
fix Pass the `output_len=some_integer` argument to `compress` or `decompress` functions when the final size is predictable.
gotcha Some compression algorithms (e.g., Blosc2, ISA-L backends like igzip, ideflate, izlib) are considered experimental. These typically require building `cramjam` from source with specific feature flags enabled. They are not available out-of-the-box with a standard `pip install`.
fix Refer to the official `cramjam` GitHub repository or documentation for instructions on building from source with experimental features if you need to use them.
breaking The test script, and any code that passes `numpy` arrays to `cramjam`, relies on `numpy`, which is not installed by default in the test environment. Importing it there raises a `ModuleNotFoundError`.
fix Ensure `numpy` is installed in your environment by running `pip install numpy`.
breaking When running `cramjam` on minimal Linux distributions like Alpine, especially with Python versions that lack readily available pre-built wheels, core compression/decompression functionality (e.g., Snappy, Brotli) might fail to round-trip data, which shows up as `Decompressed data matches original: False`. This typically occurs if `cramjam`'s Rust extensions were not built correctly for the platform, or if necessary build dependencies were missing during installation. Always verify `cramjam`'s functionality on Alpine if you are using it.
fix Ensure that `cramjam` is installed from a compatible wheel for your Alpine Python environment, or install the necessary build tools (e.g., `rust`, `cargo`, `musl-dev`, `python3-dev`, `build-base`) before installing `cramjam` so that it can compile from source. Test core compression/decompression functionality after installation, converting results with `bytes(...)` before comparing, since `decompress` returns a `cramjam.Buffer` rather than `bytes`.
python os / libc status wheel install import disk
3.10 alpine (musl) wheel - 0.00s 22.6M
3.10 alpine (musl) - - 0.00s 22.6M
3.10 slim (glibc) wheel 1.8s 0.00s 23M
3.10 slim (glibc) - - 0.00s 23M
3.11 alpine (musl) wheel - 0.00s 24.4M
3.11 alpine (musl) - - 0.00s 24.4M
3.11 slim (glibc) wheel 1.8s 0.00s 24M
3.11 slim (glibc) - - 0.00s 24M
3.12 alpine (musl) wheel - 0.00s 16.3M
3.12 alpine (musl) - - 0.00s 16.3M
3.12 slim (glibc) wheel 1.6s 0.00s 16M
3.12 slim (glibc) - - 0.00s 16M
3.13 alpine (musl) wheel - 0.00s 16.0M
3.13 alpine (musl) - - 0.00s 15.9M
3.13 slim (glibc) wheel 1.7s 0.00s 16M
3.13 slim (glibc) - - 0.00s 16M
3.9 alpine (musl) wheel - 0.00s 22.1M
3.9 alpine (musl) - - 0.00s 22.1M
3.9 slim (glibc) wheel 2.2s 0.00s 22M
3.9 slim (glibc) - - 0.00s 22M

This quickstart demonstrates basic compression and decompression using Snappy and Brotli. It also includes an example of using `cramjam.Buffer` with `compress_into`, which writes output directly into an existing buffer and avoids extra copies when working with `bytes`, `bytearray`, or `numpy.array` objects. The `Buffer` example requires `numpy` to be installed.

import cramjam

original_data = b"This is some data to compress using cramjam!"

# Compress data using Snappy
compressed_data = cramjam.snappy.compress(original_data)
print(f"Original size: {len(original_data)} bytes")
print(f"Compressed size (Snappy): {len(compressed_data)} bytes")

# Decompress data (decompress returns a cramjam.Buffer, so convert to bytes before comparing)
decompressed_data = bytes(cramjam.snappy.decompress(compressed_data))
print(f"Decompressed data matches original: {original_data == decompressed_data}")

# Example with Brotli
compressed_brotli = cramjam.brotli.compress(original_data)
print(f"Compressed size (Brotli): {len(compressed_brotli)} bytes")
# decompress returns a cramjam.Buffer; convert to bytes before comparing
decompressed_brotli = bytes(cramjam.brotli.decompress(compressed_brotli))
print(f"Decompressed Brotli matches original: {original_data == decompressed_brotli}")

# Using cramjam.Buffer as the output target of compress_into (this part requires numpy)
from cramjam import snappy, Buffer
import numpy as np

data_np = np.frombuffer(b'some bytes here for buffer', dtype=np.uint8)
output_buffer = Buffer()
snappy.compress_into(data_np, output_buffer)

output_buffer.seek(0) # Reset buffer position for reading
decompressed_buffer_data = snappy.decompress(output_buffer)
print(f"Buffer decompressed data matches original: {bytes(data_np) == bytes(decompressed_buffer_data)}")