Collate dbt Artifacts Parser
Collate dbt Artifacts Parser is a Python library that parses dbt artifacts such as `manifest.json` and `run_results.json` into Pydantic models. It builds on `dbt-artifacts-parser` to simplify access to dbt project metadata. The current version is 0.1.4. Releases follow internal dbt-labs development, although the package is publicly available.
Common errors
- FileNotFoundError: [Errno 2] No such file or directory: 'path/to/your/manifest.json'
  cause: The specified artifact file path does not exist or is incorrect.
  fix: Verify the path to your dbt artifact file. Use an absolute path or ensure the relative path is correct from where your script is executed.
- ValueError: Invalid artifact type or version combination: manifest/vX
  cause: The `artifact_type` or `version` parameter provided to `DbtArtifactParser` does not match a known/supported dbt artifact schema.
  fix: Double-check that `artifact_type` is one of 'manifest', 'run_results', 'catalog', etc., and that `version` corresponds to the dbt schema version (e.g., 'v8' for `dbt/manifest/v8.json`). Ensure consistency between the artifact type, its version, and the actual content of the file.
- pydantic.error_wrappers.ValidationError: 1 validation error for ManifestV8 metadata -> dbt_schema_version value is not a valid enumeration member; permitted: ('https://schemas.getdbt.com/dbt/manifest/v8.json', ...)
  cause: The content of the dbt artifact file does not conform to the schema specified by the chosen `artifact_type` and `version` (e.g., trying to parse a dbt v1.0 manifest with a `version='v8'` parser).
  fix: Ensure the `version` parameter in `DbtArtifactParser` accurately reflects the `dbt_schema_version` found within the artifact file itself. If you're unsure, inspect the `dbt_schema_version` field in your artifact JSON.
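The path and version checks suggested in the fixes above can be done before any parser is constructed. The following is a minimal sketch using only the standard library; `detect_artifact_schema` is a hypothetical helper name, not part of this library. It assumes the schema URL has the form `https://schemas.getdbt.com/dbt/<artifact>/<version>.json`, and that hyphenated URL segments such as `run-results` correspond to the underscore form (`run_results`) used for `artifact_type`.

```python
import json
import re
from pathlib import Path

# Matches the tail of URLs such as https://schemas.getdbt.com/dbt/manifest/v8.json
_SCHEMA_URL = re.compile(r"/dbt/(?P<artifact>[a-z-]+)/(?P<version>v\d+)\.json$")


def detect_artifact_schema(path):
    """Return (artifact_type, version) read from an artifact's metadata.

    Hypothetical helper: it inspects the JSON directly and does not
    call the library.
    """
    artifact_path = Path(path)
    if not artifact_path.is_file():
        # Report the resolved path so relative-path mistakes are easy to spot.
        raise FileNotFoundError(f"No dbt artifact at {artifact_path.resolve()}")

    with artifact_path.open() as f:
        metadata = json.load(f).get("metadata", {})

    schema_url = metadata.get("dbt_schema_version", "")
    match = _SCHEMA_URL.search(schema_url)
    if match is None:
        raise ValueError(f"Unrecognised dbt_schema_version: {schema_url!r}")

    # Assumption: 'run-results' in the URL maps to artifact_type 'run_results'.
    artifact_type = match["artifact"].replace("-", "_")
    return artifact_type, match["version"]
```

The returned pair could then be passed as the `artifact_type` and `version` arguments described above, rather than hard-coding them.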
Warnings
- gotcha dbt artifact schema versions (`manifest.json`, `run_results.json`) change frequently with dbt-core releases. Ensure the `version` parameter passed to `DbtArtifactParser` (e.g., 'v8') matches the `dbt_schema_version` within your artifact file (e.g., 'https://schemas.getdbt.com/dbt/manifest/v8.json'). Mismatches will lead to parsing errors.
- gotcha This library relies heavily on `dbt-artifacts-parser` for the core schema definitions and parsing logic. While `collate-dbt-artifacts-parser` specifies `dbt-artifacts-parser==1.0.2`, future changes in `dbt-artifacts-parser` or `dbt-core`'s artifact schemas could introduce incompatibilities.
- gotcha The library's GitHub repository (`dbt-labs-internal`) indicates its origin as an internal dbt-labs project. While publicly available, this might imply its primary development focus is internal use cases, potentially leading to less emphasis on backward compatibility for external users or a slower response to external feature requests/bug reports.
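Because the second warning concerns a pinned dependency, it can help to confirm which version of `dbt-artifacts-parser` is actually installed in the environment. The sketch below uses only `importlib.metadata` from the standard library; `installed_version` and `check_pin` are hypothetical helper names, and the expected value `1.0.2` is taken from the warning above.

```python
from importlib.metadata import PackageNotFoundError, version

EXPECTED_PIN = "1.0.2"  # pin quoted in the warning above


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


def check_pin(distribution="dbt-artifacts-parser", expected=EXPECTED_PIN):
    """Describe whether the installed distribution matches the expected pin."""
    found = installed_version(distribution)
    if found is None:
        return f"{distribution} is not installed"
    if found != expected:
        return f"{distribution} {found} installed, expected {expected}"
    return f"{distribution} {found} matches the expected pin"


if __name__ == "__main__":
    print(check_pin())
```

A mismatch here does not by itself prove an incompatibility, but it is a quick first check when parsing starts failing after an environment change.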
Install
- pip install collate-dbt-artifacts-parser
Imports
- DbtArtifactParser
from collate_dbt_artifacts_parser.parsers import DbtArtifactParser
- parse_artifact
from collate_dbt_artifacts_parser.parser import parse_artifact
Quickstart
import json
import os

from collate_dbt_artifacts_parser.parsers import DbtArtifactParser

# Mock a simple manifest.json structure (truncated for brevity)
mock_manifest_content = {
    "metadata": {
        "dbt_schema_version": "https://schemas.getdbt.com/dbt/manifest/v8.json",
        # Manifest schema v8 is produced by dbt-core 1.4
        "dbt_version": "1.4.0",
        "generated_at": "2024-01-01T00:00:00Z",
        "invocation_id": "mock_invocation",
        "env": {},
        "adapter_type": "postgres",
    },
    "nodes": {},
    "sources": {},
    "macros": {},
}

# Create a dummy file
dummy_file_path = "mock_manifest.json"
with open(dummy_file_path, "w") as f:
    json.dump(mock_manifest_content, f)

try:
    # Initialize parser for 'manifest' artifact of schema version 'v8'
    parser = DbtArtifactParser(artifact_type="manifest", version="v8")
    manifest_model = parser.parse_artifact_file(dummy_file_path)
    print(f"Successfully parsed manifest of type: {type(manifest_model)}")
    print(f"dbt version from manifest metadata: {manifest_model.metadata.dbt_version}")
except Exception as e:
    print(f"An error occurred during parsing: {e}")
finally:
    # Clean up the dummy file
    if os.path.exists(dummy_file_path):
        os.remove(dummy_file_path)