xopen: Open Compressed Files Transparently
This Python module provides an `xopen` function that works like Python's built-in `open` function but also handles compressed files transparently. `xopen` selects the most efficient available method for reading or writing a compressed file, often using external tools such as `pigz` for parallel processing. It supports gzip (.gz), bzip2 (.bz2), and xz (.xz), and optionally Zstandard (.zst). The library is actively maintained; the current version is 2.0.2.
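To illustrate what "transparent" means here, the following is a minimal standard-library sketch of extension-based dispatch. It is a simplified stand-in for the idea, not xopen's actual implementation: the helper `open_by_extension` and the `_OPENERS` table are hypothetical names, and backend selection, external tools, and Zstandard are left out.

```python
import bz2
import gzip
import lzma
import os
import tempfile

# Hypothetical mapping from file extension to a standard-library opener.
_OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def open_by_extension(path, mode="rt", encoding="utf-8"):
    """Open `path` with the opener matching its extension, else plain open()."""
    opener = _OPENERS.get(os.path.splitext(path)[1], open)
    if "t" in mode:
        # Text mode: pass an explicit encoding, as xopen does by default.
        return opener(path, mode, encoding=encoding)
    return opener(path, mode)


# Write and read the same text through four different containers.
tmpdir = tempfile.mkdtemp()
results = {}
for name in ("demo.txt", "demo.txt.gz", "demo.txt.bz2", "demo.txt.xz"):
    path = os.path.join(tmpdir, name)
    with open_by_extension(path, "wt") as f:
        f.write("same text, different container\n")
    with open_by_extension(path, "rt") as f:
        results[name] = f.read()
```

The calling code is identical for every format; only the file name changes. That is the convenience `xopen` offers, with faster backends added on top.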
Warnings
- breaking: xopen v1.8.0 and later (including 2.x) dropped support for Python 3.7. Users on Python 3.7 or older must either upgrade Python or pin `xopen` to a version prior to 1.8.0.
- gotcha: Unlike Python's built-in `open()`, `xopen()` defaults the `encoding` parameter to 'utf-8' when opening files in text mode ('rt', 'wt', 'at'). If your files are not UTF-8 encoded, this can lead to decoding or encoding errors unless you pass `encoding` explicitly.
- gotcha: The `threads` parameter affects (de)compression performance and backend choice. `threads=None` (default) uses up to 4 CPU cores (potentially with external tools like `pigz`). `threads=0` forces single-threaded operation using only Python-based backends (`python-isal` or standard library modules). Behavior and performance can differ significantly between these settings.
- gotcha: For gzip files, `compresslevel=0` (no compression) could previously cause crashes if the external `gzip` application backend was used, as it lacked a `--0` flag. `xopen` now attempts to defer to other backends in this scenario, but unexpected behavior might still occur with specific system configurations or older `xopen` versions.
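The encoding and `threads` notes above can be sketched as follows. The helper `read_text` is a hypothetical name introduced for illustration. The `ImportError` fallback is also an assumption, added only so the sketch runs where `xopen` is not installed: it mimics xopen's UTF-8 text-mode default with `gzip.open` and ignores `threads`.

```python
import gzip
import os
import tempfile

try:
    from xopen import xopen
except ImportError:
    # Assumed stand-in, NOT part of xopen: mimics the utf-8 text-mode default.
    def xopen(filename, mode="r", encoding="utf-8", threads=None, **kwargs):
        return gzip.open(filename, mode, encoding=encoding)


def read_text(path, encoding=None):
    """Read a compressed text file in-process (threads=0)."""
    kwargs = {} if encoding is None else {"encoding": encoding}
    # threads=0 asks xopen to avoid external programs such as pigz.
    with xopen(path, mode="rt", threads=0, **kwargs) as f:
        return f.read()


# Create a gzip file whose contents are Latin-1, not UTF-8.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "latin1.txt.gz")
with gzip.open(path, "wb") as f:
    f.write("caf\u00e9".encode("latin-1"))

# Relying on the utf-8 default fails on the non-UTF-8 byte.
try:
    read_text(path)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

# Passing the real encoding explicitly decodes the file correctly.
decoded = read_text(path, encoding="latin-1")
```

Passing `encoding` explicitly whenever the file's encoding is known avoids depending on either `open()`'s locale-based default or xopen's UTF-8 default.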
Install
pip install xopen
Imports
- xopen
from xopen import xopen
Quickstart
from xopen import xopen
import gzip
import os

# Create a dummy gzipped file
with open('example.txt', 'w') as f:
    f.write('Hello, world!\nThis is a test.')
with open('example.txt', 'rb') as f_in:
    with gzip.open('example.txt.gz', 'wb') as f_out:
        f_out.write(f_in.read())
os.remove('example.txt')  # Clean up uncompressed file

# Open for reading (auto-detects gzip)
with xopen('example.txt.gz', mode='rt') as f:
    content = f.read()
    print(f'Read content: {content.strip()}')

# Open for writing (creates xz compressed file)
output_file = 'output.txt.xz'
with xopen(output_file, mode='wt', compresslevel=3) as f:
    f.write('This is compressed with xz.\nAnother line.')

# Verify writing (optional: requires another xopen to read it back)
with xopen(output_file, mode='rt') as f:
    written_content = f.read()
    print(f'Written content to {output_file}: {written_content.strip()}')

# Clean up generated files
os.remove('example.txt.gz')
os.remove(output_file)