Open Data Contract Standard (Python)
The `open-data-contract-standard` Python library provides a Pydantic model for reading and writing YAML files that conform to the Open Data Contract Standard (ODCS). It was extracted from the Data Contract CLI, and its version number mirrors the major and minor versions of the ODCS release it supports. The library is actively maintained; the current version is 3.1.2, which supports ODCS v3.1.0 and above.
Warnings
- gotcha The pip package version mirrors the major and minor versions of the Open Data Contract Standard (ODCS) it supports, but NOT the patch version. For instance, ODCS v3.1.0 is covered by the 3.1.x releases of the package, and the package's patch number (e.g. 3.1.2) advances independently of the standard's.
- breaking Migration from ODCS v2.x to v3.x involved significant breaking changes in the standard's schema, which the `open-data-contract-standard` Python library (v3.x and above) enforces. Key changes include renaming `uuid` to `id`, `columns` to `properties`, and changing the `team` (formerly `stakeholders`) structure from an array to an object.
- breaking ODCS v3.1.0 introduced stricter JSON Schema validation, disallowing additional or undefined properties in certain sections of the contract. While v3.1.0 was declared backward compatible with v3.0.x, contracts that previously contained extra, undefined fields (and were thus valid under a looser schema) will now fail validation.
- deprecated In ODCS v3.1.0, `slaDefaultElement` and the top-level `dataProduct` fields were deprecated. While still functional, they will generate warnings and are slated for removal in future major versions (e.g., ODCS v4).
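The v2.x to v3.x renames mentioned above (`uuid` to `id`, `columns` to `properties`) can be illustrated on a plain dictionary before it is handed to the library. This is only a minimal sketch: `KEY_RENAMES`, `rename_v2_keys`, and the shape of the `legacy` dictionary are hypothetical names and data chosen for illustration, not part of `open-data-contract-standard`. A real migration involves more changes than these two renames (the `team` restructuring, for example, is not handled here).

```python
from typing import Any

# Only the two renames named in the warning above; this mapping is an
# assumption-laden illustration, not a complete v2 -> v3 migration table.
KEY_RENAMES = {"uuid": "id", "columns": "properties"}


def rename_v2_keys(node: Any) -> Any:
    """Return a copy of `node` with the v2 key names replaced at any depth."""
    if isinstance(node, dict):
        return {KEY_RENAMES.get(key, key): rename_v2_keys(value)
                for key, value in node.items()}
    if isinstance(node, list):
        return [rename_v2_keys(item) for item in node]
    return node  # scalars are returned unchanged


# Hypothetical v2-style fragment, used purely as sample input.
legacy = {
    "uuid": "53581432-6c55-4ba2-a65f-72344a91553b",
    "dataset": [{"table": "my_table", "columns": [{"column": "txn_ref_dt"}]}],
}

migrated = rename_v2_keys(legacy)
print(migrated)
```

The helper builds new containers instead of mutating its input, so the original dictionary stays available for comparison. If a dictionary already contains both the old and the new key, the later one in iteration order wins, so such inputs would need manual review.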
Install
pip install open-data-contract-standard
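Because the package version tracks only the major and minor ODCS versions, it can help to compare those two components explicitly after installing. The following is a minimal sketch using only the standard library; `major_minor` and `mirrors` are hypothetical helper names, and the check only compares version numbers. It does not prove that a given contract will load.

```python
import re


def major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as '3.1.2' or 'v3.1.0'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def mirrors(package_version: str, api_version: str) -> bool:
    """True when the package and the ODCS apiVersion share major.minor."""
    return major_minor(package_version) == major_minor(api_version)


print(mirrors("3.1.2", "v3.1.0"))  # True: patch versions are ignored
print(mirrors("3.1.2", "v3.0.2"))  # False: minor versions differ
```

To run the check against the installed package rather than a literal string, the version can be read with `importlib.metadata.version("open-data-contract-standard")` from the standard library.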
Imports
- OpenDataContractStandard
from open_data_contract_standard.model import OpenDataContractStandard
Quickstart
from open_data_contract_standard.model import OpenDataContractStandard
# Example 1: Load a data contract specification from a string
data_contract_str = """
version: 1.0.0
kind: DataContract
id: 53581432-6c55-4ba2-a65f-72344a91553b
status: active
name: my_table
apiVersion: v3.1.0
"""
data_contract_from_string = OpenDataContractStandard.from_string(data_contract_str)
print("--- Data Contract from string ---")
print(data_contract_from_string.to_yaml())
# Example 2: To load from a file, use from_file.
# The commented code below writes the string above to 'data_contract.yaml'
# in the current directory, loads it, and then removes the file again.
# import os
# file_path = 'data_contract.yaml'
# with open(file_path, 'w') as f:
#     f.write(data_contract_str)
# data_contract_from_file = OpenDataContractStandard.from_file(file_path)
# print("\n--- Data Contract from file ---")
# print(data_contract_from_file.to_yaml())
# os.remove(file_path) # Clean up file