compress-pickle
compress-pickle is a Python library (version 2.1.0) that provides a thin wrapper combining the standard `pickle` module with standard compression libraries (gzip, bz2, lzma, zipfile, and optionally lz4). Its interface mirrors `pickle.dump`, `pickle.load`, `pickle.dumps`, and `pickle.loads`, so Python objects can be serialized to and deserialized from compressed files or file-like objects. Releases are infrequent; the last major update was in September 2021.
Warnings
- breaking Python 3.5 support was dropped starting from version 2.0.0. If you need to support Python 3.5, you must install `compress-pickle==1.1.1`.
- gotcha Like the underlying `pickle` module, `compress-pickle` is not secure against maliciously constructed data. Deserializing data from untrusted sources can lead to arbitrary code execution.
- gotcha For very small Python objects, the overhead introduced by compression might cause the compressed file size to be larger than the uncompressed pickled data.
- gotcha By default, `dump` and `load` infer the compression protocol from the file extension (e.g., '.pkl.gz' for gzip). If the file extension does not match the desired compression or is missing, explicitly set the `compression` argument (e.g., `compression='gzip'`).
- gotcha The `pickle` protocol used can affect compatibility. Objects pickled with newer Python versions or higher protocols may not be readable by older Python versions or lower protocols. `compress-pickle` inherits this behavior.
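The size overhead for small objects can be checked with the standard library alone. The sketch below uses `pickle` and `gzip` directly rather than compress-pickle, so it only approximates what the wrapper writes; the sample objects and the `sizes` dictionary are illustrative.

```python
import gzip
import pickle

# Illustrative objects: one tiny, one large and highly repetitive.
small = {"key": "value"}
large = ["value"] * 100_000

sizes = {}
for name, obj in [("small", small), ("large", large)]:
    raw = pickle.dumps(obj)        # uncompressed pickle bytes
    packed = gzip.compress(raw)    # the same bytes, gzip-compressed
    sizes[name] = (len(raw), len(packed))
    print(f"{name}: pickled={len(raw)} bytes, gzipped={len(packed)} bytes")
```

The gzip header and trailer cost a fixed number of bytes, which dominates for the tiny dictionary, while the repetitive list shrinks substantially.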
Install
- pip install compress-pickle
- pip install compress-pickle[lz4]
Imports
- dump
from compress_pickle import dump
- load
from compress_pickle import load
- dumps
from compress_pickle import dumps
- loads
from compress_pickle import loads
Quickstart
from compress_pickle import dump, load
import os
data = {'key': 'value', 'numbers': [1, 2, 3]}
filename = 'my_compressed_data.pkl.gz'
# Dump (serialize and compress) the data
dump(data, filename)
print(f"Data saved to {filename} with inferred gzip compression.")
# Load (decompress and deserialize) the data
loaded_data = load(filename)
print(f"Data loaded successfully: {loaded_data}")
# Clean up the created file
os.remove(filename)