Canonical JSON
Canonical JSON is a Python library designed to produce a deterministic, byte-for-byte consistent JSON serialization of Python data structures. This is crucial for applications requiring cryptographic hashing or signatures where the exact byte representation of JSON data must be consistent across different environments and executions. The library is currently at version 2.0.0 and maintains an active development and release cadence.
Common errors
-
TypeError: Object of type frozendict is not JSON serializable
cause Serializing a `frozendict` instance with `canonicaljson` 2.0.0 or later without registering a preserialisation callback. Versions before 2.0.0 handled `frozendict` automatically; v2.0.0 removed that built-in support.
fix For `canonicaljson >= 2.0.0`, register a preserialisation callback: `from canonicaljson import register_preserialisation_callback; register_preserialisation_callback(frozendict, lambda obj: dict(obj))`. Alternatively, convert each `frozendict` to a `dict` yourself before encoding.
-
TypeError: Object of type <YourCustomType> is not JSON serializable
cause You are trying to serialize a custom Python object that `canonicaljson` (and the underlying `json` module) cannot convert into a JSON value (string, number, object, array, boolean, or null).
fix Define a function that converts the object into JSON-serializable data and register it as a preserialisation callback for the type: `from canonicaljson import register_preserialisation_callback`, then `register_preserialisation_callback(MyCustomClass, my_serializer)`, where `my_serializer(obj)` returns, for example, `{'_type': 'MyCustomClass', 'value': obj.some_attribute}`.
-
ValueError: Out of range float values are not JSON compliant
cause You are serializing `float('inf')`, `float('-inf')`, or `float('nan')`. These are not valid numeric values in the JSON specification (RFC 8259), and `canonicaljson` has enforced this since v1.4.0.
fix Before passing your data to `canonicaljson.encode_canonical_json()`, replace these special float values with JSON-compatible alternatives, such as `None` (which serializes to `null`) or a descriptive string (e.g., `"Infinity"`, `"NaN"`). Example for the top level of a flat dict (requires `import math`): `data = {k: None if isinstance(v, float) and not math.isfinite(v) else v for k, v in data.items()}`. Nested structures need a recursive replacement.
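The fix for non-finite floats can be sketched for nested data as follows. This is a minimal illustration, not part of `canonicaljson`: the helper name `replace_non_finite` is hypothetical, and the standard library's `json.dumps(..., allow_nan=False)` is used as a stand-in for the strict encoder, on the assumption that `canonicaljson` rejects the same values.

```python
import json
import math
from typing import Any


def replace_non_finite(value: Any) -> Any:
    """Return a copy of `value` with NaN/Infinity floats replaced by None.

    Hypothetical helper for illustration; it is not provided by canonicaljson.
    """
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: replace_non_finite(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [replace_non_finite(item) for item in value]
    return value


raw = {
    "score": float("nan"),
    "limits": [1.5, float("inf")],
    "meta": {"min": float("-inf")},
}
clean = replace_non_finite(raw)

# Stand-in for the strict encoder: with allow_nan=False the stdlib json module
# raises ValueError for NaN/Infinity, so this call only succeeds on clean data.
encoded = json.dumps(clean, sort_keys=True, separators=(",", ":"), allow_nan=False)
print(encoded)  # {"limits":[1.5,null],"meta":{"min":null},"score":null}
```

Replacing with `None` discards the distinction between `NaN` and the two infinities; if consumers need it, returning a descriptive string from the helper is one alternative.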
Warnings
- breaking Version 2.0.0 removed direct support for `simplejson` and the `set_json_library` function. The library now relies solely on the standard library's `json` module. Code that explicitly configured `simplejson` or relied on its default use will break.
- breaking Version 2.0.0 removed built-in support for serializing `frozendict` instances directly. If your data structures included `frozendict`, they will no longer be automatically converted to canonical JSON.
- breaking Version 1.4.0 introduced stricter compliance with the JSON specification (RFC 8259, which obsoletes RFC 7159) for the floating-point values `Infinity`, `-Infinity`, and `NaN`. Earlier versions could encode them in a non-standard way. Encoding data that contains these values now raises `ValueError`, because they are not valid JSON numbers.
- gotcha `canonicaljson` makes the output deterministic, but it cannot restore precision lost before or after serialization. Many JSON implementations parse numbers as IEEE 754 double-precision floats, which cannot exactly represent integers above 2^53 or many decimal fractions. Systems that handle number precision differently can therefore derive different canonical forms from what should be the same data.
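The precision gotcha can be illustrated with the standard library alone. The sketch below simulates a consumer that parses every number as a double (via `parse_int=float`, an assumption standing in for such a system) and shows one possible mitigation, carrying large integers as strings.

```python
import json

# 2**53 + 1 is the smallest positive integer a double cannot represent exactly.
big = 2**53 + 1  # 9007199254740993

# Python integers are arbitrary precision, so the encoded text is exact.
encoded = json.dumps({"id": big}, separators=(",", ":"))

# Simulate a consumer that parses all numbers as IEEE 754 doubles.
as_double = json.loads(encoded, parse_int=float)["id"]
round_tripped = int(as_double)  # 9007199254740992, off by one

# Floats carry representation error into the output as well.
sum_text = json.dumps(0.1 + 0.2)  # '0.30000000000000004'
literal_text = json.dumps(0.3)    # '0.3'

# One possible mitigation: transmit precision-sensitive values as strings.
safe = json.dumps({"id": str(big)}, separators=(",", ":"))

print(encoded, round_tripped, sum_text, literal_text, safe)
```

Whether strings are acceptable depends on the schema the peers agree on; the point is only that the agreement has to happen before the data reaches the canonical encoder.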
Install
-
pip install canonicaljson
Imports
- encode_canonical_json
wrong import canonicaljson.encode_canonical_json (fails: `encode_canonical_json` is a function, not a submodule)
right from canonicaljson import encode_canonical_json
- register_preserialisation_callback
from canonicaljson import register_preserialisation_callback
Quickstart
import canonicaljson
from typing import Any, Dict

# Key order in the input does not affect the output
data = {"b": 2, "a": 1, "nested": {"y": 2, "x": 1}, "list": [3, 1, 2]}
canonical_bytes = canonicaljson.encode_canonical_json(data)
print(f"Canonical JSON: {canonical_bytes.decode('utf-8')}")
# Expected: {"a":1,"b":2,"list":[3,1,2],"nested":{"x":1,"y":2}}

# Example for custom types (v2.0.0+)
class CustomObject:
    def __init__(self, value):
        self.value = value

def custom_serializer(obj: CustomObject) -> Dict[str, Any]:
    # Convert the object into plain, JSON-serializable data
    return {"custom_value": obj.value, "_type": "CustomObject"}

canonicaljson.register_preserialisation_callback(CustomObject, custom_serializer)
custom_data = {"item": CustomObject("hello")}
custom_canonical_bytes = canonicaljson.encode_canonical_json(custom_data)
print(f"Canonical JSON with Custom Object: {custom_canonical_bytes.decode('utf-8')}")
# Expected: {"item":{"_type":"CustomObject","custom_value":"hello"}}
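Since the main use case is hashing and signing, the sketch below shows why a canonical encoding matters for digests. To stay runnable without third-party packages it uses a hypothetical `canonical_bytes` helper built on the standard `json` module, which only approximates canonical encoding for plain dicts, lists, strings, and integers. With `canonicaljson` installed, `canonicaljson.encode_canonical_json(data)` from the Quickstart would take the helper's place.

```python
import hashlib
import json
from typing import Any


def canonical_bytes(data: Any) -> bytes:
    """Hypothetical stand-in that approximates canonical encoding for plain data:
    sorted keys, no insignificant whitespace, UTF-8 output, no NaN/Infinity."""
    text = json.dumps(
        data,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
        allow_nan=False,
    )
    return text.encode("utf-8")


# The same mapping built with two different key orders.
first = {"b": 2, "a": 1}
second = {"a": 1, "b": 2}

digest_first = hashlib.sha256(canonical_bytes(first)).hexdigest()
digest_second = hashlib.sha256(canonical_bytes(second)).hexdigest()

# Default json.dumps keeps insertion order and adds spaces, so its bytes
# (and therefore its digest) differ from the canonical form.
digest_plain = hashlib.sha256(json.dumps(first).encode("utf-8")).hexdigest()

print(canonical_bytes(first))         # b'{"a":1,"b":2}'
print(digest_first == digest_second)  # True
print(digest_first == digest_plain)   # False
```

The digest should always be computed over the encoder's bytes directly, not over a re-decoded or pretty-printed copy, since any change in whitespace or key order changes the hash.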