Msgpack-NumPy
This package provides encoding and decoding routines for serializing and deserializing NumPy numerical and array data types with the MessagePack format. It also supports serialization of Python's built-in `complex` numbers. The current version is 0.4.8, and it is compatible with Python 2.7 and 3.5+.
Warnings
- breaking Starting with `msgpack` 0.5, the PyPI package name changed from `msgpack-python` to `msgpack`. When upgrading the underlying library from `0.4` or earlier, run `pip uninstall msgpack-python` before `pip install -U msgpack` to prevent conflicts. This directly affects `msgpack-numpy` installations.
- gotcha NumPy arrays deserialized by `msgpack-numpy` are read-only views of the underlying data buffer, which avoids an extra copy. Assigning to their elements raises `ValueError: assignment destination is read-only`; call `.copy()` first if you need a writable array.
- gotcha The primary design goal of `msgpack-numpy` is the preservation of numerical data types, which inherently adds some storage overhead to the serialized data (e.g., storing dtype, shape, kind).
- gotcha `msgpack` (and by extension `msgpack-numpy`) has limits on object sizes. For instance, the maximum length of a binary object is `2^32 - 1` bytes (about 4 GiB). Serializing a NumPy array whose data exceeds this limit can fail.
- gotcha NumPy arrays with `dtype='O'` (object type) are serialized/deserialized using Python's `pickle` module as a fallback within `msgpack-numpy`. This negates the efficiency benefits of `msgpack` and can introduce security risks if deserializing untrusted data.
Install
pip install msgpack-numpy
Imports
- msgpack_numpy
import msgpack_numpy as m
- patch
m.patch()
- encode
from msgpack_numpy import encode
- decode
from msgpack_numpy import decode
Quickstart
import msgpack
import msgpack_numpy as m
import numpy as np
# Easiest way: Monkey-patch msgpack to be numpy-aware
m.patch()
# Example NumPy array
data = {'array': np.array([1.2, 3.4, 5.6], dtype=np.float32), 'scalar': np.float64(7.8)}
# Serialize
packed_data = msgpack.packb(data, use_bin_type=True)
print(f"Packed size: {len(packed_data)} bytes")
# Deserialize
unpacked_data = msgpack.unpackb(packed_data, raw=False)
# Verify
print(f"Original array type: {type(data['array'])}, dtype: {data['array'].dtype}")
print(f"Unpacked array type: {type(unpacked_data['array'])}, dtype: {unpacked_data['array'].dtype}")
print(f"Original scalar type: {type(data['scalar'])}, value: {data['scalar']}")
print(f"Unpacked scalar type: {type(unpacked_data['scalar'])}, value: {unpacked_data['scalar']}")
# Unpacked arrays are read-only by default; copy before modifying.
# Keys are str (not bytes) because raw=False was used when unpacking.
original_array = unpacked_data['array'].copy()
original_array[0] = 99.9
print(f"Modified array: {original_array}")