Multivolume File Wrapper
multivolumefile is a Python library that provides a file-like object which splits large files into multiple smaller 'volumes' on write and presents them as a single continuous stream on read. It manages the underlying files itself, creating numbered volume files (e.g., `filename.0001`, `filename.0002`). The current version is 0.2.3, and it has a moderate release cadence, with minor updates addressing stability and feature enhancements.
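To make the split-on-write, merge-on-read idea concrete, here is a minimal sketch that uses only the standard library. It does not call multivolumefile and is not the library's implementation; the helpers `write_volumes` and `read_volumes`, and the four-digit suffix they use, are assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path
from typing import List


def write_volumes(base: Path, data: bytes, volume_size: int) -> List[Path]:
    """Split data into numbered files of at most volume_size bytes each."""
    if volume_size <= 0:
        raise ValueError("volume_size must be positive")
    paths = []
    for index, start in enumerate(range(0, len(data), volume_size), start=1):
        # Hypothetical naming scheme: <base>.0001, <base>.0002, ...
        path = base.with_name(f"{base.name}.{index:04d}")
        path.write_bytes(data[start:start + volume_size])
        paths.append(path)
    return paths


def read_volumes(base: Path) -> bytes:
    """Concatenate the numbered files, in order, back into one byte string."""
    pattern = f"{base.name}.[0-9][0-9][0-9][0-9]"
    volumes = sorted(base.parent.glob(pattern))
    return b"".join(p.read_bytes() for p in volumes)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "demo.bin"
    payload = bytes(range(256)) * 10  # 2560 bytes of sample data
    volumes = write_volumes(base, payload, volume_size=1024)
    volume_names = [p.name for p in volumes]
    volume_sizes = [p.stat().st_size for p in volumes]
    round_trip_ok = read_volumes(base) == payload
```

The caller only ever handles the base path; the numbered files are an implementation detail, which is the same contract the library offers through its file-like object.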
Warnings
- gotcha When reading an existing multivolume file, open it with the *base filename* (e.g., 'my_file'). Do not pass an individual volume file (e.g., 'my_file.0001', 'my_file.0002') to `multivolumefile.open()`: the library derives the full set of volumes from the base name, so opening a single volume can raise `FileNotFoundError` or behave incorrectly.
- gotcha The `volume` parameter sets the maximum size of each volume, in bytes, when writing. If it is omitted, the library falls back to a built-in default, so any output smaller than that default lands in a single volume file and is not split as intended. Conversely, a very small value produces an excessive number of tiny files.
- breaking Versions before 0.1.4 had known issues with append mode (`'ab'`) that could write data incorrectly or corrupt files. These are fixed in later releases, so upgrade rather than rely on append mode in old versions. Where data integrity is paramount, either rewrite the entire multivolume file or test append operations carefully.
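The trade-off in choosing a volume size can be estimated with simple arithmetic before writing anything. The sketch below is plain Python and does not call multivolumefile; `estimate_volume_count` is a hypothetical helper introduced for illustration.

```python
def estimate_volume_count(total_bytes: int, volume_size: int) -> int:
    """Return how many volumes a payload of total_bytes would need."""
    if total_bytes < 0:
        raise ValueError("total_bytes must not be negative")
    if volume_size <= 0:
        raise ValueError("volume_size must be positive")
    # Integer ceiling division: a partial final volume still counts as a file.
    return (total_bytes + volume_size - 1) // volume_size


# A 5 GiB payload with three candidate volume sizes.
total = 5 * 1024**3
counts = {
    "1 KiB": estimate_volume_count(total, 1024),
    "100 MiB": estimate_volume_count(total, 100 * 1024**2),
    "1 GiB": estimate_volume_count(total, 1024**3),
}
```

Here a 1 KiB volume size would yield over five million files for the same payload that fits in five 1 GiB volumes, which is why the value deserves an explicit choice.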
Install
pip install multivolumefile
Imports
- multivolumefile.open
import multivolumefile
Quickstart
import glob
import os

import multivolumefile

# Base filename shared by all volumes
base_filename = "my_test_multi_file"

# --- Writing a multivolume file ---
# Each volume holds at most 1024 bytes
with multivolumefile.open(base_filename, mode="wb", volume=1024) as vol:
    vol.write(b"Hello world. This is the first line.\n")
    # The data written here totals well under 1024 bytes, so it fits in the
    # first volume; more data or a smaller `volume` would create further volumes
    vol.write(b"This is a longer line that ensures multiple volumes are created if content is sufficient.\n")
    vol.write(b"End of content.\n")
print(f"Successfully wrote content to multivolume file(s) starting with '{base_filename}'.")

# --- Reading a multivolume file ---
# Open using the same base filename, not an individual volume file
with multivolumefile.open(base_filename, mode="rb") as vol:
    read_content = vol.read()
print("\n--- Read content ---")
print(read_content.decode("utf-8"))

# --- Cleanup (optional) ---
# Remove the generated volume files
for volume_path in glob.glob(f"{base_filename}.*"):
    os.remove(volume_path)
print(f"\nCleaned up files matching '{base_filename}.*'")