BSON Codec for Python
The `bson` library is a standalone codec for the BSON (Binary JSON) serialization format that does not depend on MongoDB or PyMongo. It serializes Python data structures into BSON bytes and deserializes BSON data back into Python objects. The current version is 0.5.10. Judging by its GitHub activity and history of resolved issues, the project appears to be maintained, though releases do not follow a fixed schedule.
Warnings
- breaking Installing `bson` from PyPI alongside PyMongo can cause conflicts and `ImportError`. PyMongo ships its own `bson` package under the same top-level import name, and the standalone `bson` distribution from PyPI is not compatible with it. Install only one of the two in a given environment.
- gotcha `bson.dumps()` only supports BSON-defined data types. Attempting to encode custom Python objects (e.g., instances of custom classes) directly will raise a `TypeError`.
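- gotcha Decoding invalid, corrupted, or truncated BSON data raises an exception. `bson.errors.InvalidBSON` is the exception used by PyMongo's bundled `bson` package; the standalone package may raise a different exception type, so check the installed version before catching it by name.
- gotcha When encoding `datetime` objects, `datetime.now()` without a timezone yields a naive local time, which becomes ambiguous once serialized. Prefer timezone-aware values such as `datetime.now(timezone.utc)`. `datetime.utcnow()` also returns UTC, but as a naive object, and it is deprecated as of Python 3.12.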
- gotcha Release 0.4.5 was skipped due to a 'version issue with pypi'. While not directly impacting users, it indicates potential historical inconsistencies in the release process.
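The custom-object and `datetime` gotchas above can both be handled by converting application objects into plain dictionaries of built-in types before encoding. The following is a minimal sketch using only the standard library; `Event` and `to_bson_ready` are hypothetical names introduced for illustration and are not part of the `bson` package.

```python
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone


@dataclass
class Event:
    """Hypothetical application class, used only for illustration."""
    name: str
    tags: list = field(default_factory=list)
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def to_bson_ready(event):
    """Return a plain dict of built-in types for an Event (hypothetical helper)."""
    doc = asdict(event)
    # Assumption: a naive datetime is already expressed in UTC, so we only attach tzinfo.
    if doc["created"].tzinfo is None:
        doc["created"] = doc["created"].replace(tzinfo=timezone.utc)
    return doc


doc = to_bson_ready(Event(name="signup", tags=["web"]))
# `doc` now contains only str, list, and timezone-aware datetime values.
print(doc)
```

A dictionary built this way holds only types the warnings above describe as encodable, so it should be a suitable argument for `bson.dumps()`.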
Install
pip install bson
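Because of the PyMongo conflict described under Warnings, it may be worth checking which distributions are already present before or after installing. The sketch below uses the standard library's `importlib.metadata`; `installed_version` and `bson_conflict` are hypothetical helper names, and the check only looks at the two distribution names `pymongo` and `bson`.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def bson_conflict():
    """True when both distributions that provide a top-level `bson` package are installed."""
    return installed_version("pymongo") is not None and installed_version("bson") is not None


if bson_conflict():
    print("Both 'pymongo' and 'bson' are installed; `import bson` may not resolve as expected.")
else:
    print("No pymongo/bson conflict detected.")
```

Running this inside the target virtual environment gives a quick signal; it does not repair an environment where one package has already overwritten the other's files.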
Imports
- dumps
import bson; bson.dumps(...)
- loads
import bson; bson.loads(...)
- ObjectId
from bson.objectid import ObjectId
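- InvalidBSON (provided by PyMongo's bundled `bson`; may be absent from the standalone package)
from bson.errors import InvalidBSON
- BSON (provided by PyMongo's bundled `bson`; may be absent from the standalone package)
import bson; bson.BSON(...)
- json_util (provided by PyMongo's bundled `bson`; may be absent from the standalone package)
from bson import json_util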
Quickstart
import bson
from bson.objectid import ObjectId
from datetime import datetime, timezone
# Create a Python dictionary with BSON-compatible types
data = {
    "message": "Hello BSON!",
    "timestamp": datetime.now(timezone.utc),  # Timezone-aware datetime avoids ambiguity
    "_id": ObjectId(),
    "value": 123.45,
    "tags": ["python", "bson", "example"],
}
# Encode the dictionary to BSON bytes
bson_data = bson.dumps(data)
print(f"Encoded BSON (bytes): {bson_data}")
# Decode the BSON bytes back to a Python dictionary
decoded_data = bson.loads(bson_data)
print(f"Decoded data: {decoded_data}")
print(f"Type of decoded_data: {type(decoded_data)}")
print(f"Type of _id in decoded_data: {type(decoded_data['_id'])}")
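# Demonstrate handling of invalid input.
# The exception type can differ between the standalone package and PyMongo's
# bundled `bson` (which raises bson.errors.InvalidBSON), so catch broadly here.
try:
    bson.loads(b"invalid_bson_data")
except Exception as e:
    print(f"Caught expected error: {type(e).__name__}: {e}")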
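Since the exact exception raised for bad input can vary, a cheap structural check before decoding can help reject obviously truncated payloads. Per the BSON specification, a document starts with a little-endian int32 holding its total size and ends with a NUL byte. The sketch below relies only on that framing; `looks_like_bson` is a hypothetical helper, not part of the `bson` package, and passing the check does not guarantee the payload will decode.

```python
import struct


def looks_like_bson(data):
    """Cheap framing check: the length prefix matches and the last byte is NUL."""
    if len(data) < 5:  # smallest valid document: int32 size + terminator
        return False
    (declared,) = struct.unpack_from("<i", data, 0)  # little-endian int32 total size
    return declared == len(data) and data[-1] == 0


# {"a": 1} encoded by hand: size (12), int32 element named "a", value 1, terminator.
sample = b"\x0c\x00\x00\x00" + b"\x10a\x00" + b"\x01\x00\x00\x00" + b"\x00"

print(looks_like_bson(sample))                # True
print(looks_like_bson(sample[:-3]))           # False (truncated)
print(looks_like_bson(b"invalid_bson_data"))  # False (length prefix does not match)
```

A check like this is only a pre-filter; the call to `bson.loads()` should still be wrapped in a `try`/`except` as in the Quickstart.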