Zodbpickle
Zodbpickle is a fork of Python's built-in `pickle` module, primarily designed to provide a uniform pickling interface for ZODB (Zope Object Database). It extends Python 2.7's `pickle` and `cPickle` to support protocol 3 opcodes and introduces `zodbpickle.binary` for consistent binary value handling across Python 2 and 3. For Python 3, it forks the `pickle` module to re-add support for the `noload()` operation, which ZODB utilizes. The library is currently active, with version 4.3 as of the last check, and aims to ensure seamless data serialization and deserialization in ZODB environments spanning different Python versions.
Common errors
- UnicodeDecodeError: 'utf-8' codec can't decode byte 0x... in position ...: invalid start byte
  cause: Attempting to unpickle Python 2 `str` data (which could contain arbitrary bytes) in Python 3, where `str` objects are expected to be Unicode. This often occurs during ZODB database migrations.
  fix: Ensure that Python 2 binary strings were stored using `zodbpickle.binary`. For existing Python 2 ZODB databases, use `zodbupdate --convert-py3 --encoding <your_encoding>` during migration to properly decode and convert string data.
- TypeError: can't concat str to bytes
  cause: This error typically arises in Python 3 code when attempting to concatenate `str` (Unicode) and `bytes` objects directly. It can be a symptom of incorrect handling of legacy Python 2 data in a ZODB migrated to Python 3, where `zodbpickle.binary` or `bytes` were not used consistently for binary data.
  fix: Explicitly encode `str` to `bytes` (e.g., `s.encode('utf-8')`) or decode `bytes` to `str` (e.g., `b.decode('utf-8')`) as appropriate before concatenation. Review data handling logic, especially for values originally created in Python 2 or involving `zodbpickle.binary`, to ensure type consistency.
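Both errors can be reproduced in a few lines. The following is a minimal sketch, not a migration recipe: `PY2_PICKLE` is a hand-built protocol 2 pickle standing in for data written by Python 2, the variable names are illustrative, and the fallback to the standard library `pickle` is an assumption made only so the sketch runs where `zodbpickle` is not installed. It relies on the `encoding` keyword of `loads()`, which accepts a codec name or `"bytes"`.

```python
try:
    from zodbpickle import pickle
except ImportError:  # assumption: fall back to the stdlib so the sketch runs anywhere
    import pickle

# Hand-built protocol 2 pickle of the Python 2 str 'caf\xe9' (Latin-1 bytes).
# Opcodes: PROTO 2, SHORT_BINSTRING (length 4), BINPUT 0, STOP.
PY2_PICKLE = b"\x80\x02U\x04caf\xe9q\x00."

# With the default encoding, the non-ASCII byte cannot be decoded to text.
try:
    pickle.loads(PY2_PICKLE)
    default_result = "decoded"
except UnicodeDecodeError:
    default_result = "UnicodeDecodeError"

# Option 1: the value really was text, so name the codec it was written in.
as_text = pickle.loads(PY2_PICKLE, encoding="latin-1")

# Option 2: the value was binary data, so keep it as bytes.
as_bytes = pickle.loads(PY2_PICKLE, encoding="bytes")

# Mixing str and bytes raises TypeError; convert one side explicitly first.
label = "name: "
joined_text = label + as_text                     # str + str
joined_bytes = label.encode("utf-8") + as_bytes   # bytes + bytes
```

Which option applies depends on what the original Python 2 value represented; the codec used here (`latin-1`) is only an example.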
Warnings
- breaking The `pickle` module, and by extension `zodbpickle`, is not intended to be secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source, as it could lead to arbitrary code execution.
- gotcha When migrating ZODB databases from Python 2 to Python 3, Python 2 `str` instances are loaded as Python 3 `str` (Unicode strings) by default. If those instances contained binary data, this can lead to `UnicodeDecodeError` or incorrect data interpretation. `zodbpickle.binary` was introduced so that binary strings written by Python 2 are loaded correctly as `bytes` in Python 3.
- gotcha Although `zodbpickle` re-adds the `noload()` method (removed from the standard Python 3 `pickle`) for ZODB compatibility, applications might see performance differences depending on the `pickle` protocol in use. Python 3.4 introduced protocol 4, which adds framing among other changes that affect performance. The standard `pickle` can use it directly, whereas `zodbpickle` is forked from earlier Python 3 `pickle` versions.
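The binary-handling point above can be illustrated with a round trip. This is a minimal sketch under two assumptions: that `zodbpickle.binary` behaves like `bytes` on Python 3, and that falling back to the standard library is acceptable when `zodbpickle` is not installed. The payload and variable names are illustrative.

```python
try:
    from zodbpickle import binary, pickle
except ImportError:  # assumption: stdlib fallback so the sketch runs anywhere
    import pickle
    binary = bytes   # assumption: on Python 3, zodbpickle.binary acts like bytes

# Bytes that are not valid UTF-8, as a Python 2 str could have contained.
raw = binary(b"\x00\xff\xfe not valid UTF-8")

# Protocol 3 stores bytes natively, so nothing is decoded on the way back.
data = pickle.dumps(raw, protocol=3)
restored = pickle.loads(data)
```

Because the value is stored as bytes rather than text, loading it never attempts a Unicode decode.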
Install
- pip install zodbpickle
Imports
- pickle
from zodbpickle import pickle  # not the standard library's `import pickle`
- fastpickle
from zodbpickle import fastpickle
- slowpickle
from zodbpickle import slowpickle
Quickstart
from zodbpickle import pickle


class MyObject:
    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, MyObject):
            return NotImplemented
        return self.name == other.name and self.value == other.value


# Pickle an object
obj = MyObject('example', 123)
pickled_obj = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
print(f"Pickled object (bytes): {pickled_obj}")

# Unpickle the object
unpickled_obj = pickle.loads(pickled_obj)
print(f"Unpickled object: {unpickled_obj.name}, {unpickled_obj.value}")
assert obj == unpickled_obj