zlib-state: Low-level Zlib Interface for Decoding State Capture

0.1.10 · active · verified Tue Apr 14

The zlib-state library provides a low-level Python interface to the zlib compression library that can capture and restore decompression state. This makes it possible to resume decompression from a saved point partway through a gzip or raw deflate stream instead of starting again at the first byte. It provides `Decompressor` for byte-level control and `GzipStateFile` for file-like access. The project is actively maintained and supports recent Python versions up to 3.13.
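To make the idea of a captured decompression state concrete, here is a minimal sketch that uses only the standard library's `zlib` module, not zlib-state itself. It relies on `Decompress.copy()`, which snapshots a decoder in memory; the payload and the split point are arbitrary choices for illustration, and the sketch does not claim to reflect how zlib-state stores its state.

```python
import zlib

# Build a raw deflate stream (negative wbits: no zlib or gzip header).
payload = b"".join(b"record %05d\n" % i for i in range(5000))
compressor = zlib.compressobj(6, zlib.DEFLATED, -15)
raw = compressor.compress(payload) + compressor.flush()

# Decompress only the first half of the compressed bytes.
split = len(raw) // 2
decomp = zlib.decompressobj(-15)
head = decomp.decompress(raw[:split])

# Snapshot the decoder at this point (in memory only).
snapshot = decomp.copy()

# Both decoders can now continue independently from the same position.
tail_a = decomp.decompress(raw[split:]) + decomp.flush()
tail_b = snapshot.decompress(raw[split:]) + snapshot.flush()

print(tail_a == tail_b)            # the snapshot resumes identically
print(head + tail_a == payload)    # head plus tail reproduces the input
```

The snapshot is needed because deflate back-references can point into previously decoded data, so a fresh decoder generally cannot start in the middle of a stream without that history.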

Warnings

Install

Imports

Quickstart

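This quickstart uses `zlib_state.GzipStateFile` to read a gzipped file line by line, save the decoding state that has been recorded by the time a target line is reached, and then resume decompression from that saved position in a fresh file object. Opening the file with `keep_last_state=True` makes the most recent state and its position available as `last_state` and `last_state_pos`; passing both to `zseek` restores them. This is useful for random access or partial processing of large compressed files. The example creates a dummy `test_data.txt.gz` file and removes it when it finishes.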

import zlib_state
import gzip
import os

# Create a dummy gzipped file for demonstration
dummy_content = b"Line 1\nLine 2\nLine 3 (State Capture Point)\nLine 4\n" * 50
with gzip.open("test_data.txt.gz", "wb") as f:
    f.write(dummy_content)

TARGET_LINE = 100  # zero-based index, so the loop stops on the 101st line
state_to_resume = None
position_to_resume = 0

try:
    # Use GzipStateFile to capture state at a specific point
    with zlib_state.GzipStateFile('test_data.txt.gz', keep_last_state=True) as f:
        for i, line in enumerate(f):
            if i == TARGET_LINE:
                state_to_resume = f.last_state
                position_to_resume = f.last_state_pos
                print(f"Captured state at line {i+1}, byte pos {position_to_resume}")
                break

    # Test the state explicitly so a falsy position does not skip the resume
    if state_to_resume is not None:
        print(f"\nResuming decompression from line {TARGET_LINE+1}...")
        with zlib_state.GzipStateFile('test_data.txt.gz') as f_resume:
            # Restore the saved position and decoder state
            f_resume.zseek(position_to_resume, state_to_resume)
            remainder = f_resume.read(50)  # Read a small portion after resuming
            print(f"Decompressed remainder (first 50 bytes): {remainder.decode('utf-8').strip()}...")
    else:
        print("No state was captured before the target line; nothing to resume.")

except Exception as e:
    print(f"An error occurred: {e}")
finally:
    # Clean up the dummy file
    if os.path.exists("test_data.txt.gz"):
        os.remove("test_data.txt.gz")
