Partd
Partd is a Python library that provides appendable key-value storage, primarily for raw bytes. It is designed for shuffle-style workloads, where new data is appended efficiently to the value already stored under a key. The current version is 1.4.2, released in May 2024; releases are infrequent but the project is stable.
Common errors
- ModuleNotFoundError: No module named 'partd'
  cause: The 'partd' library is not installed in the current Python environment.
  fix: pip install partd
- AttributeError: module 'partd' has no attribute 'File'
  cause: The installed package exports `File` at the top level (the Quickstart uses `partd.File`), so this error usually means that a local file or directory named `partd` is shadowing the library, or that the installation is broken.
  fix: Rename the shadowing module or reinstall the package; `from partd import File` should then work.
- AttributeError: 'partd.File' object has no attribute 'read'
  cause: Calling a non-existent `read` method on a `partd.File` object. `partd.File` provides appendable key-value storage and does not expose a file-like `read` method; data is retrieved with `get`.
  fix: Use `p.get(key)` (or `p.get([key1, key2])` for several keys) to retrieve data from a `partd.File` object.
- TypeError: a bytes-like object is required, not 'str'
  cause: Partd stores raw bytes. `append` expects a dict whose values are byte strings (e.g., `b'value'`), not regular Python strings.
  fix: from partd import File; p = File(); p.append({'mykey': b'myvalue'}) # or 'myvalue'.encode('utf-8')
- TypeError: Can't instantiate abstract class partd with abstract method __exit__
  cause: Attempting to instantiate an abstract interface rather than a concrete store implementation.
  fix: from partd import File; store = File() # Instantiate a concrete implementation such as File or Dict
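The byte-handling and retrieval fixes above can be sketched together. This is a minimal illustration, assuming `partd` is installed and that calling `partd.File()` without a path creates a temporary directory; the key and variable names are placeholders.

```python
import partd

# Assumption: File() with no path stores its data in a temporary directory.
store = partd.File()
try:
    text = "myvalue"
    # Values must be bytes, so encode str objects before appending.
    store.append({"mykey": text.encode("utf-8")})
    store.append({"mykey": b"-more"})

    # Retrieve with get(); there is no file-like read() method.
    raw = store.get("mykey")
    decoded = raw.decode("utf-8")
finally:
    # Remove the stored data once it is no longer needed.
    store.drop()
```

Appends to the same key are concatenated, so `raw` holds both pieces in the order they were written.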
Warnings
- gotcha Partd's core functionality stores raw bytes. When working with Python objects (like lists, dictionaries, or custom classes), you must explicitly use an encoding layer (e.g., `partd.python.Python` for `pickle`/`msgpack` or `partd.numpy.Numpy` for arrays) or handle serialization yourself. Direct `append` expects `bytes`.
- gotcha For many small write operations, especially in parallel environments, the default file-based Partd implementations (`partd.file.File`) can be inefficient due to I/O overhead and locking. The documentation suggests that 'this is hard to do in parallel while also maintaining consistency' and recommends a centralized server solution for caching.
- gotcha For file-backed Partd instances (`partd.file.File`), it's crucial to explicitly call `.drop()` to clean up the created directories and files when they are no longer needed. Failing to do so can leave orphaned data on the filesystem.
- gotcha `partd.numpy.Numpy` requires the optional `numpy` package. If `numpy` is not installed, importing it (directly, or through an application or test script that uses `partd.numpy.Numpy`) raises `ModuleNotFoundError`. Install `numpy` in the environment before relying on this encoding layer.
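To illustrate the encoding-layer warning, the sketch below wraps an in-memory `partd.Dict` in `partd.Python` so that lists of Python objects can be appended without manual serialization. The key and values are made up for illustration, and the example assumes the top-level `partd.Python` and `partd.Dict` exports.

```python
import partd

# Assumption: Python(...) wraps another store and handles (de)serialization
# of lists of Python objects on append and get.
objects = partd.Python(partd.Dict())

objects.append({"scores": [1, 2, 3]})
objects.append({"scores": [4, 5]})

# The appended lists come back as one concatenated list.
scores = objects.get("scores")
```

An in-memory `Dict` backend keeps the example free of filesystem cleanup; with a `File` backend the same calls apply, followed by `.drop()`.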
Install
-
pip install partd
Imports
- File
from partd.file import File
- Buffer
from partd.buffer import Buffer
- Python
from partd.python import Python
- Numpy
from partd.numpy import Numpy
Quickstart
import partd
import numpy as np
# Create a Partd backed by a directory
# (a Buffer wraps two other stores, e.g. partd.Buffer(partd.Dict(), partd.File()))
p = partd.File('my_partd_data')
# Append key-byte pairs
p.append({'x': b'Hello '})
p.append({'x': b'world!'})
p.append({'y': b'123'})
p.append({'y': b'456'})
# Get the bytes associated with a single key or a list of keys
print(f"Value for 'x': {p.get('x')}")
print(f"Value for 'y' and 'x': {p.get(['y', 'x'])}")
# Example with NumPy encoding
# Requires 'numpy' as an optional dependency
p_np = partd.numpy.Numpy(partd.File('my_numpy_data'))
p_np.append({'data': np.array([1, 2, 3])})
p_np.append({'data': np.array([4, 5, 6])})
print(f"NumPy array: {p_np.get('data')}")
# Clean up (for File-backed Partd)
p.drop()
p_np.drop()
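The Quickstart mentions an in-memory buffer. As a hedged sketch: `partd.Buffer` is composed from two other stores, a fast one and a slow one, rather than created on its own. The example assumes the `Buffer(fast, slow)` argument order and uses a temporary `File` as the slow store.

```python
import partd

# Assumption: Buffer(fast, slow) keeps data in `fast` and spills to `slow`
# only when its memory limit is exceeded.
slow = partd.File()   # temporary directory, since no path is given
fast = partd.Dict()   # in-memory store
buffered = partd.Buffer(fast, slow)

buffered.append({"x": b"Hello "})
buffered.append({"x": b"world!"})

# get() combines whatever is held in the slow and fast stores for the key.
greeting = buffered.get("x")

# Drop clears both underlying stores.
buffered.drop()
```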