GZIP Stream Compression

1.2.0 · active · verified Fri Apr 17

gzip-stream is a lightweight Python library (version 1.2.0) for compressing and decompressing data on the fly using the GZIP format. It provides both synchronous and asynchronous stream interfaces, so large data can be handled efficiently without loading it into memory all at once. Releases are infrequent and typically add features or compatibility updates rather than breaking changes.
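The streaming idea can be sketched with only the standard library: `zlib.compressobj` with `wbits=31` emits GZIP-framed output incrementally, which is essentially what on-the-fly compression amounts to. This is a simplified stdlib-only sketch, not the library's actual implementation; `compress_stream` is a hypothetical helper name, not part of gzip-stream's API.

```python
import gzip
import zlib

def compress_stream(chunks):
    """Yield GZIP-framed output for an iterable of byte chunks.

    A stdlib-only sketch of the streaming idea; `compress_stream` is a
    hypothetical helper, not part of gzip-stream's API.
    """
    # wbits=31 selects the GZIP container (header + CRC32 trailer).
    compressor = zlib.compressobj(wbits=31)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer input before emitting output
            yield out
    yield compressor.flush()  # emit any buffered data plus the trailer

data = b"streaming compression without holding everything in memory " * 20
chunks = (data[i:i + 64] for i in range(0, len(data), 64))
compressed = b"".join(compress_stream(chunks))
assert gzip.decompress(compressed) == data
```

Because each input chunk is consumed and discarded as soon as it is compressed, peak memory use is bounded by the chunk size rather than the total data size.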

Install

pip install gzip-stream

Imports

import gzip_stream

Quickstart

This quickstart demonstrates synchronous compression of a byte stream with `GZIPCompressedStream`. The class wraps a readable binary file-like object and itself behaves like a readable stream, so compressed bytes can be pulled from it chunk by chunk. The output is standard GZIP, so the round trip is verified here with the standard library's `gzip` module; for asynchronous decompression the library provides `AsyncGZIPDecompressedStream`.

import gzip
import io

import gzip_stream

# Original data (must be bytes)
original_data = (
    b"This is some data that will be compressed and then decompressed "
    b"using gzip-stream."
) * 5  # Repeat to make it long enough to compress well

# --- Compression (synchronous) ---
# GZIPCompressedStream wraps any readable binary file-like object.
# In real applications this could be an open file or a network stream.
compressed_stream = gzip_stream.GZIPCompressedStream(
    io.BytesIO(original_data),
    compression_level=7,
)

# Read the compressed output in chunks instead of all at once.
compressed_chunks = []
while True:
    chunk = compressed_stream.read(1024)
    if not chunk:
        break
    compressed_chunks.append(chunk)
compressed_data = b"".join(compressed_chunks)

print(f"Original size: {len(original_data)} bytes")
print(f"Compressed size: {len(compressed_data)} bytes")

# --- Verify the round trip ---
# The output is a regular GZIP member, so any GZIP implementation can
# decompress it; the standard library is used here for verification.
decompressed_data = gzip.decompress(compressed_data)

print(f"Decompressed data (first 50 bytes): {decompressed_data[:50].decode()}")

# Verify data integrity
assert original_data == decompressed_data
print("\nSuccess: original and decompressed data match!")
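The reverse direction can be sketched the same way with the standard library: `zlib.decompressobj` with `wbits=31` consumes GZIP input chunk by chunk. Again a stdlib-only sketch of the streaming idea; `decompress_stream` is a hypothetical helper, not part of gzip-stream's API.

```python
import gzip
import zlib

def decompress_stream(chunks):
    """Yield decompressed bytes for an iterable of GZIP-compressed chunks.

    A stdlib-only sketch; `decompress_stream` is a hypothetical helper,
    not part of gzip-stream's API.
    """
    # wbits=31 expects the GZIP container (header + CRC32 trailer).
    decompressor = zlib.decompressobj(wbits=31)
    for chunk in chunks:
        out = decompressor.decompress(chunk)
        if out:  # output only appears once enough input has arrived
            yield out
    tail = decompressor.flush()  # release any remaining buffered output
    if tail:
        yield tail

payload = b"round-trip check for chunked GZIP decompression " * 10
blob = gzip.compress(payload)
pieces = (blob[i:i + 20] for i in range(0, len(blob), 20))
assert b"".join(decompress_stream(pieces)) == payload
```

This mirrors the compression side: the decompressor holds only its internal window, so arbitrarily large GZIP inputs can be processed with bounded memory.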
