Multivolume File Wrapper

0.2.3 · active · verified Thu Apr 09

multivolumefile is a Python library that provides a file-like wrapper which splits large files into multiple smaller "volumes" while writing and presents them as one continuous stream while reading. It handles the underlying file operations and creates numbered volume files (e.g., `filename.0001`, `filename.0002`). The current version is 0.2.3; recent releases have been minor updates addressing stability and feature enhancements.
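To make the split-and-merge idea concrete, here is a minimal standard-library sketch of the behaviour described above. It is an illustration only, not the library's implementation: `split_to_volumes` and `join_volumes` are hypothetical helpers, and the suffix width and starting index are left as parameters because the numbering scheme is an assumption here.

```python
import tempfile
from pathlib import Path
from typing import List


def split_to_volumes(data: bytes, base: Path, volume_size: int,
                     digits: int = 4, start: int = 1) -> List[Path]:
    """Write data as consecutive volume files next to `base`; return their paths."""
    if volume_size <= 0:
        raise ValueError("volume_size must be positive")
    paths = []
    # Each slice of at most `volume_size` bytes becomes one numbered file.
    for index, offset in enumerate(range(0, len(data), volume_size), start=start):
        path = base.with_name(f"{base.name}.{index:0{digits}d}")
        path.write_bytes(data[offset:offset + volume_size])
        paths.append(path)
    return paths


def join_volumes(paths: List[Path]) -> bytes:
    """Concatenate the volume files in order to recover the original bytes."""
    return b"".join(path.read_bytes() for path in paths)


with tempfile.TemporaryDirectory() as tmp:
    payload = bytes(range(256)) * 10  # 2560 bytes of sample data
    volumes = split_to_volumes(payload, Path(tmp) / "sample.bin", volume_size=1024)
    volume_names = [p.name for p in volumes]
    volume_sizes = [p.stat().st_size for p in volumes]
    restored = join_volumes(volumes)
```

The last volume is simply whatever remains after the full-size ones, which is why a set rarely consists of equally sized files.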

Warnings

gotcha When reading an existing multivolume set, open it by its *base filename* (e.g., `my_file`), not by an individual volume file (e.g., `my_file.0001`, `my_file.0002`). The library uses the base name to locate and order the volumes, so passing a volume path may raise `FileNotFoundError` or behave incorrectly.
Fix: Always read a multivolume set with `multivolumefile.open("my_base_filename", mode="rb")`.
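When a path to one of the volumes is all you have (for example, from a file picker), the base filename can be recovered before opening the set. The sketch below uses a hypothetical helper, `base_name_of`, and assumes that volume suffixes are purely numeric; it is not part of the library.

```python
import re

# Assumption: a volume suffix is a dot followed only by digits, e.g. ".0001".
_VOLUME_SUFFIX = re.compile(r"\.\d+$")


def base_name_of(path: str) -> str:
    """Strip one trailing numeric volume suffix, if present, and return the base path."""
    return _VOLUME_SUFFIX.sub("", path)


from_volume = base_name_of("archive.7z.0002")
already_base = base_name_of("my_file")
```

A base filename that itself ends in a numeric extension (such as `report.2024`) would be shortened by this helper, so apply it only to paths known to be volumes.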
gotcha The volume size is set when writing, through the `volume` keyword argument (in bytes). If it is omitted, the library falls back to its built-in default, so a small or medium payload may end up in a single volume instead of being split as intended. Conversely, a value that is too small produces an excessive number of tiny files.
Fix: Pass `volume` explicitly, e.g., `multivolumefile.open('filename', mode='wb', volume=1024 * 1024)` for 1 MiB volumes.
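A quick way to sanity-check a chosen volume size is to estimate how many files it will produce. The helper below, `expected_volume_count`, is a hypothetical name and plain arithmetic; it assumes every volume except the last is filled completely.

```python
def expected_volume_count(total_bytes: int, volume_bytes: int) -> int:
    """Estimate the number of volumes needed to hold total_bytes."""
    if total_bytes < 0:
        raise ValueError("total_bytes must not be negative")
    if volume_bytes <= 0:
        raise ValueError("volume_bytes must be positive")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return -(-total_bytes // volume_bytes)


five_gib_at_one_mib = expected_volume_count(5 * 1024**3, 1024**2)
small_payload = expected_volume_count(150, 1024)
```

Here a 5 GiB payload with 1 MiB volumes works out to 5120 files, while a 150-byte payload with 1024-byte volumes fits in one. The estimate returns 0 for an empty payload; whether the library still creates an empty first volume in that case is not covered by this sketch.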
breaking Versions before 0.1.4 had known issues with append mode (`'ab'`) that could write data incorrectly or corrupt files. These were fixed in later releases, but append mode should not be relied on in older versions. Where data integrity is paramount, rewrite the entire multivolume set or test append operations carefully.
Fix: Upgrade to version 0.1.4 or newer for stable append mode behavior. Where possible, prefer `'wb'` mode for fresh data or full overwrites.
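One way to act on this warning is to check the installed version before choosing append mode. The following is a minimal sketch using the standard library's `importlib.metadata`; `parse_version` and `append_mode_is_safe` are hypothetical helpers, and the 0.1.4 threshold is taken from the warning above rather than verified independently.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

MIN_APPEND_VERSION = (0, 1, 4)  # threshold quoted in the warning above


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.2.3' into (0, 2, 3), stopping at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def append_mode_is_safe(installed: Optional[str] = None) -> bool:
    """Return True if the given (or installed) version meets the threshold."""
    if installed is None:
        try:
            installed = metadata.version("multivolumefile")
        except metadata.PackageNotFoundError:
            return False  # package not installed, so nothing is safe to use
    return parse_version(installed) >= MIN_APPEND_VERSION
```

Calling `append_mode_is_safe()` with no argument inspects the installed distribution; passing a version string makes the check easy to test.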

Install

`pip install multivolumefile` (installs the stable version)

Imports

mv.open
```
import multivolumefile as mv
```

Quickstart

This quickstart writes data through `multivolumefile.open`, which splits the content into volumes of the size given by `volume`, then reads the entire content back as a single stream. It ends by cleaning up the generated volume files.

import glob
import os

import multivolumefile as mv

# Base filename shared by every volume in the set
base_filename = "my_test_multi_file"

# --- Writing a multivolume file ---
# Volumes are capped at 64 bytes so that this small payload spans several of them
with mv.open(base_filename, mode="wb", volume=64) as f:
    f.write(b"Hello world. This is the first line.\n")
    # This line is longer than a single 64-byte volume, so it continues into later volumes
    f.write(b"This is a longer line that ensures multiple volumes are created if content is sufficient.\n")
    f.write(b"End of content.\n")

print(f"Successfully wrote content to multivolume file(s) starting with '{base_filename}'.")

# --- Reading a multivolume file ---
# Open using the same base filename, not an individual volume
with mv.open(base_filename, mode="rb") as f:
    read_content = f.read()
    print("\n--- Read content ---")
    print(read_content.decode("utf-8"))

# --- Cleanup (optional) ---
# Remove the generated volume files
for f_path in glob.glob(f"{base_filename}.*"):
    os.remove(f_path)
print(f"\nCleaned up files matching '{base_filename}.*'")
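The cleanup pattern `{base_filename}.*` also matches unrelated files such as `my_test_multi_file.txt`. A more conservative approach, sketched below, is to list only files whose suffix is numeric and inspect them before deleting. `list_volumes` is a hypothetical helper built on the standard library, and the numeric-suffix rule is an assumption about how volumes are named.

```python
import glob
import os
import re
import tempfile


def list_volumes(base_filename):
    """Return (path, size) pairs for files named '<base>.<digits>', ordered by volume number."""
    pattern = re.compile(re.escape(base_filename) + r"\.(\d+)")
    found = []
    for path in glob.glob(glob.escape(base_filename) + ".*"):
        match = pattern.fullmatch(path)
        if match:  # skip files such as '<base>.txt'
            found.append((int(match.group(1)), path, os.path.getsize(path)))
    return [(path, size) for _, path, size in sorted(found)]


# Demonstration with throwaway files instead of a real multivolume set
with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "demo")
    for suffix, body in (("0002", b"de"), ("0001", b"abc"), ("txt", b"ignored")):
        with open(f"{base}.{suffix}", "wb") as fh:
            fh.write(body)
    listing = [(os.path.basename(p), size) for p, size in list_volumes(base)]
```

Iterating over `list_volumes(base_filename)` and calling `os.remove` on each path would then delete the volumes while leaving neighbouring files untouched.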
