{"id":4494,"library":"datacontract-cli","title":"Data Contract CLI","description":"The datacontract CLI is an open-source command-line tool (current version 0.11.8) for working with Data Contracts. It natively supports the Open Data Contract Standard (ODCS) to lint data contracts, connect to data sources, execute schema and quality tests, detect breaking changes, and export to different formats. Written in Python, it can be used as a standalone CLI tool, in CI/CD pipelines, or directly as a Python library. The project is actively maintained with frequent releases.","status":"active","version":"0.11.8","language":"en","source_language":"en","source_url":"https://github.com/datacontract/datacontract-cli","tags":["data contracts","cli","data quality","schema validation","data governance","etl","data engineering"],"install":[{"cmd":"pip install datacontract-cli","lang":"bash","label":"Basic Installation"},{"cmd":"pip install 'datacontract-cli[all]'","lang":"bash","label":"Installation with all data source connectors"}],"dependencies":[{"reason":"Used internally for data quality testing.","package":"soda-core","optional":true},{"reason":"Used internally for schema validation.","package":"fastjsonschema","optional":true},{"reason":"Used internally for native connections and testing, often with a version restriction.","package":"duckdb","optional":true},{"reason":"Numerous optional extras exist for specific database connectors (e.g., athena, bigquery, snowflake, postgres, s3). 
The '[all]' extra installs all of them.","package":"datacontract-cli[<db_type>]","optional":true}],"imports":[{"symbol":"DataContract","correct":"from datacontract.data_contract import DataContract"}],"quickstart":{"code":"import os\nfrom datacontract.data_contract import DataContract\n\n# Simulate a datacontract.yaml file content\ndatacontract_yaml_content = '''\ndataContractSpecification: 1.2.0\nid: urn:datacontract:example:test-contract\ninfo:\n  title: Example Test Contract\n  version: 1.0.0\n  owner: Data Team\nservers:\n  local_file:\n    type: local\n    path: ./data_{model}.csv\n    format: csv\n    delimiter: ','\nmodels:\n  my_data:\n    description: A simple dataset for testing.\n    fields:\n      id:\n        type: string\n        primaryKey: true\n      name:\n        type: string\n      value:\n        type: integer\n    quality:\n      - type: sql\n        description: 'All values should be positive.'\n        query: |\n          SELECT COUNT(*)\n          FROM my_data\n          WHERE value <= 0\n        mustBe: 0\n'''\n\n# Create a dummy data file for the test (matches the server path ./data_{model}.csv)\nwith open('data_my_data.csv', 'w') as f:\n    f.write('id,name,value\\n')\n    f.write('1,Alice,10\\n')\n    f.write('2,Bob,20\\n')\n\n# Write the data contract to a temporary file\nwith open('datacontract.yaml', 'w') as f:\n    f.write(datacontract_yaml_content)\n\n# Environment variables for credentials are often required for real data sources.\n# For local testing, they might not be strictly needed depending on the 'server' configuration.\n# os.environ['DATACONTRACT_S3_ACCESS_KEY_ID'] = os.environ.get('DATACONTRACT_S3_ACCESS_KEY_ID', '')\n# os.environ['DATACONTRACT_S3_SECRET_ACCESS_KEY'] = os.environ.get('DATACONTRACT_S3_SECRET_ACCESS_KEY', '')\n\ntry:\n    data_contract = DataContract(data_contract_file=\"datacontract.yaml\")\n    run_results = data_contract.test()\n\n    if run_results.has_passed():\n        print(\"Data contract tests passed successfully.\")\n    else:\n        print(\"Data contract tests 
failed.\")\n    print(run_results.to_json())\n\nexcept Exception as e:\n    print(f\"An error occurred: {e}\")\nfinally:\n    # Clean up dummy files\n    os.remove('datacontract.yaml')\n    os.remove('data_my_data.csv')\n","lang":"python","description":"This quickstart demonstrates how to use `datacontract-cli` as a Python library to load a data contract from a YAML file and execute its defined schema and quality tests. It simulates a simple local data source. For actual data sources like S3 or BigQuery, ensure relevant environment variables are set for credentials, as shown in comments."},"warnings":[{"fix":"Migrate your usage to the Python CLI or library. If absolutely necessary to use the Go version, find its forked repository. Review the migration guide for specific syntax changes.","message":"The project migrated from Go to Python, introducing breaking changes for users relying on the Go CLI. The Go version has been forked and is no longer actively developed by the main project. Users previously relying on the Go version for programmatic use need to switch to the Python library.","severity":"breaking","affected_versions":"Before 0.10.x (Go versions) to 0.10.x+ (Python versions)"},{"fix":"Migrate existing data contract YAML files to the Open Data Contract Standard (ODCS). A migration instruction is available in the documentation. Be aware of unsupported features like internal `$ref` definitions or lineage.","message":"The internal data model transitioned from 'Data Contract Specification' to 'Open Data Contract Standard (ODCS) v3.1.0' as the default. This is a major change, and not all features of the old specification are supported in ODCS. The Data Contract Specification is now deprecated.","severity":"breaking","affected_versions":"0.11.0 and later (existing Data Contract Specification files are supported in 0.11.x until end of 2026)"},{"fix":"Always use environment variables or a secure secret management system for sensitive credentials. 
Consult the documentation for the specific environment variables required for your data source type.","message":"Credentials for connecting to data sources (e.g., S3, BigQuery, PostgreSQL) are typically provided via environment variables and should not be hardcoded in your `datacontract.yaml` or committed to version control. Each server type has specific environment variable naming conventions (e.g., `DATACONTRACT_S3_ACCESS_KEY_ID`).","severity":"gotcha","affected_versions":"All versions"},{"fix":"Ensure your environment uses a compatible Python version (3.10, 3.11, or 3.12). Consider using virtual environments (e.g., `venv`, `conda`) or `uv` to manage Python versions and dependencies.","message":"The library requires Python versions >=3.10 and <3.13. Using unsupported Python versions may lead to installation failures or runtime issues.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Rely on `datacontract-cli`'s own dependency management (e.g., by installing with `pip install 'datacontract-cli[all]'`) to ensure compatible versions of its internal engines. If issues arise, check the `datacontract-cli` changelog for specific dependency version pins.","message":"Specific internal dependencies, such as `DuckDB`, may have version restrictions. For example, version 0.11.3 fixed a dependency issue by restricting `DuckDB` to `<1.4.0`. Attempting to use a newer, incompatible version of such internal dependencies can cause failures.","severity":"gotcha","affected_versions":"May vary by `datacontract-cli` patch version"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}