Partd

Version 1.4.2 · verified Tue May 12 · pure-Python install verified

Partd is a Python library that provides appendable key-value storage for raw bytes. It is designed for shuffle-style workloads, where data must be appended efficiently to the value already stored under a key. The current version is 1.4.2, released May 2024; the project has a stable, though infrequent, release cadence.

pip install partd
error ModuleNotFoundError: No module named 'partd'
cause The 'partd' library is not installed in the current Python environment.
fix
pip install partd
error AttributeError: module 'partd' has no attribute 'File'
cause Attempting to access 'File' as a direct attribute or submodule of the 'partd' module, rather than importing it as a class using 'from partd import File'.
fix
from partd import File
error AttributeError: 'partd.File' object has no attribute 'read'
cause Attempting to call a non-existent method 'read' on a `partd.File` object. `partd.File` provides appendable key-value storage and does not expose a standard file-like 'read' method; instead, it uses methods like `get`.
fix
Refer to the partd documentation for correct usage, typically using p.get(key) to retrieve data from a partd.File object.
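As a minimal sketch of the `get`-based retrieval described above (the temporary directory is created just for the example):

```python
import tempfile

from partd import File

# File-backed store in a throwaway directory (illustrative path)
p = File(tempfile.mkdtemp())

# Appends to the same key accumulate rather than overwrite
p.append({'key': b'hello '})
p.append({'key': b'world'})

print(p.get('key'))  # b'hello world'

p.drop()  # remove the on-disk data
```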
error TypeError: a bytes-like object is required, not 'str'
cause Partd stores raw bytes: `append` expects a dict mapping keys to byte strings (e.g., `{'key': b'value'}`), not regular Python strings.
fix
from partd import File; p = File(); p.append({'mykey': b'myvalue'}) # encode strings first, e.g. 'myvalue'.encode('utf-8')
error TypeError: Can't instantiate abstract class partd with abstract method __exit__
cause The `partd` (lowercase) object directly imported from the `partd` package is an abstract base class and cannot be instantiated directly.
fix
from partd import File; store = File() # Instantiate a concrete implementation like File, Buffer, or Dict
gotcha Partd's core functionality stores raw bytes. When working with Python objects (like lists, dictionaries, or custom classes), you must explicitly use an encoding layer (e.g., `partd.python.Python` for `pickle`/`msgpack` or `partd.numpy.Numpy` for arrays) or handle serialization yourself. Direct `append` expects `bytes`.
fix Use `from partd.python import Python` for general Python objects or `from partd.numpy import Numpy` for NumPy arrays, composing them with your chosen Partd implementation (e.g., `p = Python(File('my_python_data'))`).
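A short sketch of the composition pattern above: wrapping a `File` store in the `Python` encoder so plain Python objects are serialized transparently (the temporary directory is illustrative):

```python
import tempfile

from partd import File
from partd.python import Python

# Python(...) handles serialization; File(...) handles byte storage
p = Python(File(tempfile.mkdtemp()))

p.append({'nums': [1, 2, 3]})
p.append({'nums': [4, 5]})

print(p.get('nums'))  # appended lists come back concatenated: [1, 2, 3, 4, 5]

p.drop()
```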
gotcha For many small write operations, especially in parallel environments, the default file-based Partd implementations (`partd.file.File`) can be inefficient due to I/O overhead and locking. The documentation suggests that 'this is hard to do in parallel while also maintaining consistency' and recommends a centralized server solution for caching.
fix Consider using `partd.buffer.Buffer` for in-memory caching or explore `partd.zmq` (which requires `pyzmq`) for a centralized server solution when dealing with numerous small, concurrent writes to improve performance and consistency.
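For example, a `Buffer` composes a fast in-memory store with a slower on-disk one; this sketch assumes the two-argument `Buffer(fast, slow)` constructor, and the temporary directory is illustrative:

```python
import tempfile

from partd import Buffer, Dict, File

# Small writes land in the in-memory Dict first and spill to the
# File store under memory pressure; get() sees data from both layers.
p = Buffer(Dict(), File(tempfile.mkdtemp()))

for i in range(100):
    p.append({'log': str(i).encode() + b','})

print(len(p.get('log')))  # total bytes across all 100 appends

p.drop()
```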
gotcha For file-backed Partd instances (`partd.file.File`), it's crucial to explicitly call `.drop()` to clean up the created directories and files when they are no longer needed. Failing to do so can leave orphaned data on the filesystem.
fix Always include `partd_instance.drop()` in your cleanup routine or context manager when using file-backed Partd implementations.
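One way to guarantee the cleanup described above is a try/finally block (a sketch; the path comes from tempfile):

```python
import tempfile

from partd import File

p = File(tempfile.mkdtemp())
try:
    p.append({'k': b'payload'})
    print(p.get('k'))  # b'payload'
finally:
    p.drop()  # always delete the files partd created, even on error
```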
gotcha Using `partd.numpy.Numpy` functionality, or any test script that implicitly or explicitly imports `numpy` while interacting with `partd` components, requires the `numpy` package to be installed. If `numpy` is not present in your environment, attempts to import `numpy` will result in a `ModuleNotFoundError`.
fix Install `numpy` in your environment using `pip install numpy`. Ensure your test environment or application setup includes `numpy` as a dependency if you intend to use `partd.numpy` features.
breaking `numpy` is not installed automatically with `partd`. If your application or test suite relies on `partd.numpy.Numpy`, importing it without `numpy` present raises `ModuleNotFoundError`; declare `numpy` as an explicit dependency.
fix Install `numpy` using `pip install numpy` in your environment before running the tests or application.
python  os / libc      status  install  import  disk
3.9     alpine (musl)  wheel   -        0.06s   18.1M
3.9     alpine (musl)  -       -        0.07s   18.1M
3.9     slim (glibc)   wheel   1.8s     0.05s   19M
3.9     slim (glibc)   -       -        0.05s   19M
3.10    alpine (musl)  wheel   -        0.06s   18.6M
3.10    alpine (musl)  -       -        0.07s   18.6M
3.10    slim (glibc)   wheel   1.6s     0.04s   19M
3.10    slim (glibc)   -       -        0.04s   19M
3.11    alpine (musl)  wheel   -        0.11s   20.6M
3.11    alpine (musl)  -       -        0.12s   20.6M
3.11    slim (glibc)   wheel   1.7s     0.09s   21M
3.11    slim (glibc)   -       -        0.10s   21M
3.12    alpine (musl)  wheel   -        0.09s   12.5M
3.12    alpine (musl)  -       -        0.10s   12.5M
3.12    slim (glibc)   wheel   1.6s     0.09s   13M
3.12    slim (glibc)   -       -        0.10s   13M
3.13    alpine (musl)  wheel   -        0.10s   12.2M
3.13    alpine (musl)  -       -        0.10s   12.1M
3.13    slim (glibc)   wheel   1.6s     0.10s   13M
3.13    slim (glibc)   -       -        0.10s   13M

This quickstart demonstrates how to initialize a file-backed Partd instance, append byte data to keys, and retrieve the accumulated data. It also includes an example of using `partd.numpy.Numpy` to store and retrieve NumPy arrays, abstracting away the byte serialization. Remember to call `.drop()` to clean up file-backed Partd stores.

import partd
import numpy as np

# Create a Partd backed by a directory (or in-memory buffer)
p = partd.File('my_partd_data') # or p = partd.Dict() for a purely in-memory store

# Append key-byte pairs
p.append({'x': b'Hello '})
p.append({'x': b'world!'})
p.append({'y': b'123'})
p.append({'y': b'456'})

# Get bytes associated to keys
print(f"Value for 'x': {p.get('x')}")
print(f"Value for 'y' and 'x': {p.get(['y', 'x'])}")

# Example with NumPy encoding
# Requires 'numpy' as an optional dependency
p_np = partd.numpy.Numpy(partd.File('my_numpy_data'))
p_np.append({'data': np.array([1, 2, 3])})
p_np.append({'data': np.array([4, 5, 6])})
print(f"NumPy array: {p_np.get('data')}")

# Clean up (for File-backed Partd)
p.drop()
p_np.drop()