dbt-artifacts-parser
A Python library for parsing dbt artifacts (like `manifest.json`, `run_results.json`, `catalog.json`) into strongly-typed Pydantic objects. It provides Python representations for various dbt artifact schemas, enabling easy programmatic access and manipulation of dbt project metadata and execution results. The library is actively maintained and frequently updated to support the latest stable dbt artifact versions.
Common errors
- pydantic.v1.error_wrappers.ValidationError: ...
  cause: The dbt artifact JSON does not conform to the expected Pydantic schema, often due to an incompatible dbt version, an incorrect schema class chosen, or a malformed artifact file.
  fix: Verify that your dbt artifact file is valid and that you are using the correct schema class (e.g., `ManifestV10`) from `dbt_artifacts_parser.schema.*` that precisely matches the `dbt_schema_version` specified within your artifact JSON. Ensure your `dbt-artifacts-parser` library version is compatible with your dbt project's version.
- ModuleNotFoundError: No module named 'dbt_artifacts_parser.parser'
  cause: The `dbt-artifacts-parser` library is not installed in your Python environment or the import path specified in your code is incorrect.
  fix: Install the library using `pip install dbt-artifacts-parser`. Double-check your import statement for correctness, e.g., `from dbt_artifacts_parser.parser import parse_artifact`.
- FileNotFoundError: [Errno 2] No such file or directory: 'path/to/manifest.json'
  cause: The specified path to the dbt artifact file (e.g., `manifest.json`, `run_results.json`) is incorrect, or the file does not exist at the given location.
  fix: Verify the absolute or relative path to your dbt artifact file. Ensure the file is present at the location specified in your code and that your application has read permissions for the file.
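Two of the errors above (a missing file and a schema mismatch) can be caught before any parsing happens. The following is a minimal sketch using only the standard library. `load_artifact` and `schema_version_of` are hypothetical helper names, not part of `dbt-artifacts-parser`, and the regular expression assumes schema URLs of the form `https://schemas.getdbt.com/dbt/<artifact>/v<N>.json`.

```python
import json
import re
from pathlib import Path

# Assumed URL shape: https://schemas.getdbt.com/dbt/manifest/v10.json
_SCHEMA_URL = re.compile(r"/dbt/(?P<artifact>[a-z-]+)/v(?P<version>\d+)\.json$")


def load_artifact(path):
    """Load a dbt artifact JSON file, failing early with a clear message."""
    artifact_path = Path(path)
    if not artifact_path.is_file():
        raise FileNotFoundError(f"dbt artifact not found: {artifact_path.resolve()}")
    with artifact_path.open("r", encoding="utf-8") as f:
        return json.load(f)


def schema_version_of(artifact):
    """Return (artifact_type, version) read from metadata.dbt_schema_version."""
    url = artifact.get("metadata", {}).get("dbt_schema_version", "")
    match = _SCHEMA_URL.search(url)
    if match is None:
        raise ValueError(f"unrecognized dbt_schema_version: {url!r}")
    return match.group("artifact"), int(match.group("version"))
```

Calling `schema_version_of(load_artifact("path/to/manifest.json"))` before choosing a schema class makes it easier to tell a wrong path from a wrong class, since each failure raises its own exception.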
Warnings
- gotcha Using a schema class that does not match your artifact's schema version (e.g., `ManifestV10` when your dbt project produced a v11 manifest). The `dbt-artifacts-parser` library supports specific dbt artifact schema versions. A schema class that doesn't match the `dbt_schema_version` found within your artifact JSON will lead to a `ValidationError` or to incomplete or incorrect parsing.
- breaking Breaking changes due to upstream dbt artifact schema updates. As dbt Labs releases new dbt versions, the underlying artifact schemas (like manifest.json structure) can change. While `dbt-artifacts-parser` aims to keep pace, if you upgrade your dbt project version, you may need to upgrade `dbt-artifacts-parser` and potentially update the specific schema class used in your parsing code (e.g., from `ManifestV10` to `ManifestV11`).
- gotcha Performance and memory usage with very large dbt artifact files. Parsing extremely large `manifest.json` or `run_results.json` files from complex dbt projects can be memory-intensive as the entire artifact is loaded into Python objects. This might lead to high memory consumption or slow processing times.
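To get a rough feel for the memory concern above, the sketch below measures the peak memory used while decoding a JSON document with the standard library's `tracemalloc`. It covers only the JSON decoding step. Building Pydantic objects on top of the decoded dictionary would add to this figure. The helper name `peak_memory_of_load` and the synthetic 10,000-node document are illustrative assumptions, not real project data.

```python
import json
import tracemalloc


def peak_memory_of_load(json_text):
    """Return (parsed_dict, peak_bytes) for decoding one JSON document."""
    tracemalloc.start()
    try:
        parsed = json.loads(json_text)
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return parsed, peak


# Synthetic stand-in for a large manifest: 10,000 placeholder nodes.
fake_manifest = json.dumps(
    {"nodes": {f"model.demo.m{i}": {"name": f"m{i}"} for i in range(10_000)}}
)
parsed, peak_bytes = peak_memory_of_load(fake_manifest)
print(f"{len(parsed['nodes'])} nodes, peak ~{peak_bytes / 1_000_000:.1f} MB")
```

Running the same measurement against a real `manifest.json` gives a baseline to compare against once typed parsing is added.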
Install
- pip install dbt-artifacts-parser
Imports
- parse_artifact
from dbt_artifacts_parser.parser import parse_artifact
- ManifestV10
from dbt_artifacts_parser.manifest import Manifest
from dbt_artifacts_parser.schema.manifest import ManifestV10
- RunResultsV1
from dbt_artifacts_parser.schema.run_results import RunResultsV1
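The class names listed above appear to follow a pattern: the artifact type in CamelCase, then `V`, then the schema version number. Assuming that pattern holds, the sketch below derives the class name you would expect from an artifact's `dbt_schema_version` URL. `expected_class_name` is a hypothetical helper, not a library function, and it only builds a string. It does not check that such a class exists in your installed version.

```python
import re


def expected_class_name(dbt_schema_version):
    """Map a schema URL to a class name such as 'ManifestV10' (assumed naming pattern)."""
    match = re.search(r"/dbt/([a-z-]+)/v(\d+)\.json$", dbt_schema_version)
    if match is None:
        raise ValueError(f"unrecognized dbt_schema_version: {dbt_schema_version!r}")
    artifact, version = match.groups()
    # "run-results" -> "RunResults", "manifest" -> "Manifest"
    camel = "".join(part.capitalize() for part in artifact.split("-"))
    return f"{camel}V{version}"


name = expected_class_name("https://schemas.getdbt.com/dbt/manifest/v10.json")
print(name)
```

Comparing this string with the class you actually import is a cheap guard against the version mismatch described under Warnings.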
Quickstart
import json
from dbt_artifacts_parser.parser import parse_artifact
from dbt_artifacts_parser.schema.manifest import ManifestV10
# In a real scenario, you'd load your manifest.json file, e.g.:
# with open('path/to/manifest.json', 'r') as f:
#     manifest_dict = json.load(f)
# Example minimal valid manifest structure for demonstration
manifest_json_str = """
{
  "metadata": {
    "dbt_schema_version": "https://schemas.getdbt.com/dbt/manifest/v10.json",
    "dbt_version": "1.6.0",
    "generated_at": "2023-10-27T10:00:00.000000Z",
    "invocation_id": "test_invocation",
    "env": {},
    "adapter_type": "postgres",
    "project_id": "test_project",
    "user_id": "test_user",
    "send_anonymous_usage_stats": false
  },
  "nodes": {},
  "sources": {},
  "macros": {},
  "docs": {},
  "exposures": {},
  "metrics": {},
  "groups": {},
  "selectors": {},
  "disabled": {},
  "semantic_models": {}
}
"""
manifest_dict = json.loads(manifest_json_str)
# Parse the dictionary into a strongly-typed Pydantic object
manifest = parse_artifact(manifest_dict, ManifestV10)
print(f"Parsed dbt Manifest (dbt version: {manifest.metadata.dbt_version})")
print(f"Schema version: {manifest.metadata.dbt_schema_version}")
print(f"Number of nodes: {len(manifest.nodes)}")
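Beyond counting nodes, a common next step is to group them by kind. The sketch below does this on the raw dictionary (before or without typed parsing), relying on the `resource_type` key that dbt writes on each manifest node. `count_nodes_by_type` and the three-node sample are hypothetical, for illustration only.

```python
from collections import Counter


def count_nodes_by_type(manifest_dict):
    """Count entries of manifest_dict['nodes'] grouped by their resource_type."""
    nodes = manifest_dict.get("nodes", {})
    return Counter(node.get("resource_type", "unknown") for node in nodes.values())


# Hypothetical three-node manifest fragment, for illustration only.
sample = {
    "nodes": {
        "model.demo.orders": {"resource_type": "model"},
        "model.demo.customers": {"resource_type": "model"},
        "test.demo.not_null_orders_id": {"resource_type": "test"},
    }
}
counts = count_nodes_by_type(sample)
for resource_type, count in sorted(counts.items()):
    print(f"{resource_type}: {count}")
```

Working on the plain dictionary keeps this independent of any particular schema class, at the cost of the type checking that the parsed objects provide.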