{"id":9801,"library":"gzip-stream","title":"GZIP Stream Compression","description":"gzip-stream is a lightweight Python library (version 1.2.0) for compressing and decompressing data on the fly using the GZIP format. It provides both synchronous and asynchronous stream interfaces, allowing efficient handling of large data without loading it all into memory. Releases are infrequent and typically add new features or compatibility updates rather than breaking changes.","status":"active","version":"1.2.0","language":"en","source_language":"en","source_url":"https://github.com/leenr/gzip-stream","tags":["gzip","compression","stream","asyncio","bytes"],"install":[{"cmd":"pip install gzip-stream","lang":"bash","label":"Install stable version"}],"dependencies":[],"imports":[{"symbol":"GZIPCompressedStream","correct":"from gzip_stream import GZIPCompressedStream"},{"symbol":"GZIPDecompressedStream","correct":"from gzip_stream import GZIPDecompressedStream"},{"note":"For asynchronous compression in asyncio applications.","symbol":"AsyncGZIPCompressedStream","correct":"from gzip_stream import AsyncGZIPCompressedStream"},{"note":"For asynchronous decompression in asyncio applications.","symbol":"AsyncGZIPDecompressedStream","correct":"from gzip_stream import AsyncGZIPDecompressedStream"}],"quickstart":{"code":"import io\n\nfrom gzip_stream import GZIPCompressedStream, GZIPDecompressedStream\n\n# Original data (must be bytes)\noriginal_data = b\"This is some data that will be compressed and then decompressed using gzip-stream.\" * 5  # Make it longer\n\n# --- Compression Example (Synchronous) ---\n# GZIPCompressedStream wraps a readable binary file-like object.\n# In real apps, this could be an open file or a network response body.\ncompressed_stream = GZIPCompressedStream(io.BytesIO(original_data), compression_level=7)\n\n# Read the compressed output in chunks until EOF so the GZIP footer is emitted.\ncompressed_chunks = []\nwhile True:\n    chunk = compressed_stream.read(1024)\n    if not chunk:\n        break\n    compressed_chunks.append(chunk)\ncompressed_data = b\"\".join(compressed_chunks)\n\nprint(f\"Original size: {len(original_data)} bytes\")\nprint(f\"Compressed size: {len(compressed_data)} bytes\")\n\n# --- Decompression Example (Synchronous) ---\n# GZIPDecompressedStream wraps a readable stream of compressed bytes.\ndecompressed_stream = GZIPDecompressedStream(io.BytesIO(compressed_data))\ndecompressed_data = decompressed_stream.read()\n\nprint(f\"Decompressed data (first 50 bytes): {decompressed_data[:50].decode()}\")\n\n# Verify data integrity\nassert original_data == decompressed_data\nprint(\"\\nSuccess: Original and decompressed data match!\")","lang":"python","description":"This quickstart demonstrates synchronous compression and decompression of a byte stream using `gzip-stream`. It wraps a binary file-like object with `GZIPCompressedStream`, reads the compressed output in chunks, and then wraps the compressed bytes with `GZIPDecompressedStream` to recover the original data."},"warnings":[{"fix":"Open input streams in binary mode (e.g. `open(path, 'rb')`) so they yield `bytes`, and decode output chunks to `str` (e.g. `chunk.decode('utf-8')`) only when necessary.","message":"The library exclusively handles byte streams. Providing strings directly to `GZIPCompressedStream` or expecting strings from `GZIPDecompressedStream` without explicit encoding/decoding will lead to `TypeError` or `UnicodeDecodeError`.","severity":"gotcha","affected_versions":"All"},{"fix":"Use `async for` with `AsyncGZIPCompressedStream` and `AsyncGZIPDecompressedStream` inside an `async def` function run with `asyncio.run()`; use blocking `read()` calls (or a plain `for` loop) with the synchronous streams.","message":"Synchronous and asynchronous stream classes (e.g. `GZIPCompressedStream` vs `AsyncGZIPCompressedStream`) are distinct and not interchangeable. Attempting synchronous iteration (`for`) over an async stream will fail, and vice versa.","severity":"gotcha","affected_versions":"All"},{"fix":"Always drain the stream, for example by calling `read()` in a loop until it returns `b''`, or by iterating until the stream is exhausted.","message":"When compressing, it's crucial to read the `GZIPCompressedStream` (or `AsyncGZIPCompressedStream`) all the way to EOF so that the GZIP footer, which contains the CRC-32 checksum and uncompressed length, is written. If you stop reading prematurely, the resulting compressed data will be truncated or corrupted.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-17T00:00:00.000Z","next_check":"2026-07-16T00:00:00.000Z","problems":[{"fix":"Encode your string data to bytes before passing it to the stream. Example: `my_string.encode('utf-8')`.","cause":"Attempting to pass string data directly to `GZIPCompressedStream` or `GZIPDecompressedStream`, which expect `bytes`.","error":"TypeError: a bytes-like object is required, not 'str'"},{"fix":"Ensure you are using `async for chunk in stream:` within an `async def` function, and run the asynchronous code using `asyncio.run()`.","cause":"Trying to iterate an asynchronous stream object (`AsyncGZIPCompressedStream` or `AsyncGZIPDecompressedStream`) with a synchronous `for` loop.","error":"TypeError: 'AsyncGZIPCompressedStream' object is not iterable"}]}
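The footer gotcha in the record's warnings can be illustrated with only the standard library: `zlib.compressobj(wbits=31)` produces GZIP-framed output chunk by chunk, and the final `flush()` is what emits the footer (CRC-32 plus uncompressed length). This is a hedged stdlib analogue of the streaming pattern, not gzip-stream's actual implementation; the helper name `gzip_compress_chunks` is invented for the sketch.

```python
import io
import zlib


def gzip_compress_chunks(fileobj, chunk_size=1024):
    """Yield GZIP-compressed chunks read from a binary file-like object."""
    # wbits=31 (16 + 15) tells zlib to emit a GZIP header and footer.
    compressor = zlib.compressobj(level=7, wbits=31)
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        out = compressor.compress(chunk)
        if out:  # compress() may buffer internally and return b''
            yield out
    # flush() emits the GZIP footer (CRC-32 + length); skipping this step
    # is exactly how truncated or corrupted output happens.
    yield compressor.flush()


data = b"streaming compression without loading everything into memory " * 100
compressed = b"".join(gzip_compress_chunks(io.BytesIO(data)))
print(f"{len(data)} -> {len(compressed)} bytes")

# Round-trip check: wbits=31 also selects the GZIP container on decompression.
assert zlib.decompress(compressed, wbits=31) == data
```

Abandoning the generator before the final `flush()` (for example, by breaking out of a consuming loop early) leaves data that `zlib.decompress` rejects as incomplete, which mirrors the "always drain the stream" warning for `GZIPCompressedStream`.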