Data Contract Specification (Python Library)
The `datacontract-specification` Python library provides a Pydantic model for the Data Contract Specification. It lets users define, load, and validate data contracts written in machine-readable YAML, so that contract definitions can be handled programmatically. The library is actively developed, with releases that mirror Data Contract Specification versions, and it serves as the programmatic core for tools such as the `Data Contract CLI`.
Warnings
- breaking The underlying Data Contract Specification, which this library models with Pydantic, is officially deprecated in favor of the Open Data Contract Standard (ODCS) as of ODCS v3.1. Users are advised to migrate to ODCS.
- gotcha This `datacontract-specification` library provides only the Pydantic model of the Data Contract Specification. It does *not* include the CLI tools (like `lint`, `test`, `export`, `import`, `breaking`) or direct programmatic interaction with data sources. For command-line functionality and comprehensive data contract enforcement, use the `datacontract-cli` library.
- breaking Modifying a data contract definition incompatibly (e.g., changing a field type, removing a required field, or making an optional field required) is a breaking change for downstream data consumers. Such changes can cause data pipeline failures and erode consumer trust.
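The three kinds of breaking change named in the last warning can be illustrated with a minimal sketch. It works on plain dictionaries rather than on the library's Pydantic model, and the helper name `find_breaking_changes` and the dictionary layout are hypothetical, chosen only for this illustration. It also assumes that any type change is incompatible, which is stricter than a real compatibility check would be.

```python
from typing import Dict, List

# Hypothetical plain-dict view of one model's fields:
# field name -> {"type": ..., "required": ...}
FieldMap = Dict[str, dict]


def find_breaking_changes(old: FieldMap, new: FieldMap) -> List[str]:
    """Compare two versions of a model's fields and list breaking changes."""
    problems: List[str] = []
    for name, old_field in old.items():
        was_required = old_field.get("required", False)
        if name not in new:
            # Removing a required field breaks consumers that rely on it.
            if was_required:
                problems.append(f"{name}: required field removed")
            continue
        new_field = new[name]
        # Assumption: every type change is treated as incompatible.
        if old_field.get("type") != new_field.get("type"):
            problems.append(
                f"{name}: type changed from {old_field.get('type')} "
                f"to {new_field.get('type')}"
            )
        # An optional field that becomes required is also breaking.
        if not was_required and new_field.get("required", False):
            problems.append(f"{name}: optional field became required")
    return problems


old_fields = {
    "order_id": {"type": "string", "required": True},
    "customer_id": {"type": "string"},
}
new_fields = {
    "order_id": {"type": "integer", "required": True},
    "customer_id": {"type": "string", "required": True},
}

changes = find_breaking_changes(old_fields, new_fields)
for change in changes:
    print(change)
```

For real contracts, the `breaking` command of `datacontract-cli` is the appropriate tool; this sketch only makes the categories concrete.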
Install
pip install datacontract-specification
Imports
- DataContractSpecification
from datacontract_specification.model import DataContractSpecification
Quickstart
import os
from datacontract_specification.model import DataContractSpecification
# Create a dummy data contract YAML file for the example
dummy_contract_content = """
dataContractSpecification: 1.2.0
id: urn:datacontract:example:orders-latest
info:
  title: Example Orders Latest
  version: 1.0.0
  description: |
    Successful customer orders example.
  owner: Example Team
  status: active
models:
  orders:
    type: table
    fields:
      order_id:
        type: string
        required: true
        description: Unique identifier for the order.
      customer_id:
        type: string
        description: Identifier for the customer.
"""
file_path = "example_datacontract.yaml"
with open(file_path, "w") as f:
    f.write(dummy_contract_content)

try:
    # Load the data contract from the file
    data_contract = DataContractSpecification.from_file(file_path)
    print("Data Contract Loaded Successfully:")
    print(data_contract.to_yaml())

    # Access some properties
    print(f"\nContract Title: {data_contract.info.title}")
    print(f"Contract Version: {data_contract.info.version}")
except Exception as e:
    print(f"An error occurred: {e}")
finally:
    # Clean up the dummy file
    if os.path.exists(file_path):
        os.remove(file_path)