RFC 8785 (JSON Canonicalization Scheme)
rfc8785.py is a pure-Python, dependency-free implementation of RFC 8785, the JSON Canonicalization Scheme (JCS). It serializes JSON data deterministically, which matters for cryptographic operations such as hashing and signing, where logically equivalent data must produce byte-identical output. The library is actively maintained; version 0.1.4 is the latest stable release, and releases follow an irregular cadence driven by contributions and fixes.
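To illustrate why byte-identical output matters for hashing, here is a minimal sketch. The helper names `canonical_sha256` and `naive_sha256` are hypothetical and not part of the library; only `rfc8785.dumps` comes from rfc8785.py. The contrast with `json.dumps` uses the standard library alone.

```python
import hashlib
import json


def canonical_sha256(data) -> str:
    """Hex SHA-256 digest of the JCS serialization of `data`."""
    # Third-party dependency (pip install rfc8785), imported lazily so the
    # stdlib-only contrast below still runs without it.
    import rfc8785

    return hashlib.sha256(rfc8785.dumps(data)).hexdigest()


def naive_sha256(data) -> str:
    """Hex SHA-256 digest of plain json.dumps output, for contrast only."""
    return hashlib.sha256(json.dumps(data).encode("utf-8")).hexdigest()


# The same logical object, built with different key insertion order.
first = {"b": 1, "a": 2}
second = {"a": 2, "b": 1}

# json.dumps preserves insertion order, so the naive digests disagree.
print(naive_sha256(first) == naive_sha256(second))  # False
```

With the package installed, `canonical_sha256(first)` and `canonical_sha256(second)` should be equal, because JCS sorts object keys before serializing.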
Common errors
- rfc8785.CanonicalizationError: Non-string object keys are not supported.
  cause: A dictionary with non-string keys (e.g., integers, tuples) was passed for serialization.
  fix: Convert all dictionary keys to strings before passing the data to `rfc8785.dumps` or `rfc8785.dump`.
- TypeError: Object of type X is not JSON serializable
  cause: The input data contains types not supported by standard JSON, or custom objects without a defined serialization method. This is a common Python JSON serialization error, also applicable here.
  fix: Ensure all values are standard JSON-compatible types (strings, numbers, booleans, null, lists, dictionaries). Convert custom objects or unsupported types before canonicalization.
- rfc8785.IntegerDomainError: The given integer exceeds the true integer precision of an IEEE 754 double-precision float, which is what JSON uses.
  cause: An integer in the input data lies outside the range that an IEEE 754 double-precision float can represent exactly.
  fix: Convert such integers to strings in your input data before canonicalization, as is common JSON practice for large numbers.
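The two "convert before canonicalization" fixes above can be sketched as a single pre-processing pass. This is a hypothetical helper, not part of rfc8785.py, and it assumes the library's integer limit is the usual safe-integer range of ±(2^53 − 1); check the library source if the exact bound matters to you.

```python
# Assumption: the largest integer magnitude a double represents exactly,
# taken here to match the library's IntegerDomainError threshold.
SAFE_INT_MAX = 2**53 - 1


def prepare_for_jcs(obj):
    """Return a copy of `obj` with string keys and oversized ints as strings."""
    if isinstance(obj, dict):
        return {str(key): prepare_for_jcs(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [prepare_for_jcs(value) for value in obj]
    # bool is a subclass of int, so exclude it before the range check.
    if isinstance(obj, int) and not isinstance(obj, bool) and abs(obj) > SAFE_INT_MAX:
        return str(obj)
    return obj


def dumps_prepared(data) -> bytes:
    """Canonicalize `data` after normalizing keys and large integers."""
    import rfc8785  # third-party: pip install rfc8785

    return rfc8785.dumps(prepare_for_jcs(data))


print(prepare_for_jcs({1: "a", (2, 3): [2**53, 5]}))
# {'1': 'a', '(2, 3)': ['9007199254740992', 5]}
```

Note that `str(key)` is only one possible policy: distinct keys such as `1` and `"1"` collapse into the same string, so a real application may want to detect collisions or use a more explicit key encoding.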
Warnings
- breaking As of v0.1.3, the library explicitly raises `CanonicalizationError` when encountering non-string dictionary keys, rather than attempting implicit conversion or silent failure.
- gotcha The API works with bytes, not `str`: `dumps` returns a UTF-8 encoded `bytes` object and `dump` writes to a binary (`bytes`) I/O sink. Decode the output if you need a string.
- gotcha The library enforces strict number serialization according to ECMAScript's IEEE 754 double-precision float representation. It raises `IntegerDomainError` for integers that exceed this precision and `FloatDomainError` for floats such as NaN and Infinity.
- gotcha rfc8785.py explicitly does not support indentation or pretty-printing. The output is always minimally encoded as required by JCS.
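Because NaN and Infinity are rejected, it can help to locate them before canonicalizing. The following is a standard-library sketch; `find_non_finite` is a hypothetical helper and the path notation is only an illustrative convention, not something the library provides.

```python
import math


def find_non_finite(obj, path="$"):
    """Return path strings for every NaN or Infinity float inside `obj`."""
    hits = []
    if isinstance(obj, float) and not math.isfinite(obj):
        hits.append(path)
    elif isinstance(obj, dict):
        for key, value in obj.items():
            hits.extend(find_non_finite(value, f"{path}.{key}"))
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            hits.extend(find_non_finite(value, f"{path}[{index}]"))
    return hits


sample = {"ok": 1.5, "bad": float("nan"), "series": [0.0, float("inf")]}
print(find_non_finite(sample))  # ['$.bad', '$.series[1]']
```

An empty result means no float in the structure should trigger `FloatDomainError`; how to replace the offending values (for example with `None` or a string) is an application decision.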
Install
- pip install rfc8785
Imports
- dumps
  import rfc8785
  canonical_json = rfc8785.dumps(...)
- dump
  import rfc8785
  with open('file.jcs', 'wb') as f:
      rfc8785.dump(my_data, f)
- CanonicalizationError
  from rfc8785 import CanonicalizationError
  try:
      ...
  except CanonicalizationError as e:
      ...
Quickstart
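import rfc8785

# Any JSON-compatible structure with string keys works as input.
data = {
    "name": "Alice",
    "age": 30,
    "isStudent": False,
    "courses": ["Math", "Science"],
    "address": {
        "city": "New York",
        "zip": "10001"
    }
}

# dumps returns UTF-8 encoded bytes with sorted keys and no extra whitespace.
canonical_bytes = rfc8785.dumps(data)
print(canonical_bytes.decode('utf-8'))
# {"address":{"city":"New York","zip":"10001"},"age":30,"courses":["Math","Science"],"isStudent":false,"name":"Alice"}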
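The key order in the quickstart output follows RFC 8785's property-sorting rule, which compares keys by their UTF-16 code units. The sketch below reproduces that ordering with the standard library only; `jcs_key_order` is a hypothetical name and this is not the library's actual code.

```python
def jcs_key_order(keys):
    """Sort keys by UTF-16 code units (big-endian bytes compare the same way)."""
    return sorted(keys, key=lambda key: key.encode("utf-16-be"))


print(jcs_key_order(["name", "age", "isStudent", "courses", "address"]))
# ['address', 'age', 'courses', 'isStudent', 'name']

# UTF-16 order differs from code point order once surrogate pairs appear:
# U+1F600 encodes as D83D DE00, which sorts before U+FB33.
print(jcs_key_order(["\ufb33", "\U0001F600"]) == ["\U0001F600", "\ufb33"])  # True
```

For ASCII-only keys this matches plain `sorted`, so the difference only shows up with characters outside the Basic Multilingual Plane.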