Acryl DataHub Dagster Plugin
1.5.0.17 · verified Fri May 01 · auth: no · python
A Dagster plugin that captures pipeline execution metadata and sends it to DataHub for data lineage and observability. Current version: 1.5.0.17. Requires Python >=3.10. Released as part of the DataHub project.
pip install acryl-datahub-dagster-plugin
Common errors
error ModuleNotFoundError: No module named 'acryl_datahub_dagster_plugin'
cause Incorrect import path. The PyPI package is named 'acryl-datahub-dagster-plugin', but the importable module is 'datahub_dagster_plugin', with no 'acryl_' prefix.
fix
Change imports to use 'datahub_dagster_plugin' instead of 'acryl_datahub_dagster_plugin'.
error ModuleNotFoundError: No module named 'datahub_dagster_plugin'
cause The plugin is not installed in the active environment, or only the deprecated package is installed.
fix
Run 'pip install acryl-datahub-dagster-plugin' and verify the installation with 'pip show acryl-datahub-dagster-plugin'.
error ImportError: cannot import name 'DatahubDagsterResource' from 'datahub_dagster_plugin'
cause The class is not exported from the top-level package. In recent versions, it is in the 'datahub_dagster_plugin.resources' submodule.
fix
Use: 'from datahub_dagster_plugin.resources import DatahubDagsterResource'.
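The import errors above all come down to looking for a name in the wrong module. As a minimal sketch, the helper below tries a list of candidate module paths and returns the first one that exposes the requested name. The helper resolve_symbol and the CANDIDATES list are hypothetical and written for illustration; the module paths are the ones named in the errors above and may differ in your installed version.

```python
import importlib
from typing import Any, Iterable, Optional


def resolve_symbol(name: str, modules: Iterable[str]) -> Optional[Any]:
    """Return the first attribute called `name` found in `modules`, else None."""
    for module_path in modules:
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            # Module (or one of its parent packages) is not importable;
            # ModuleNotFoundError is a subclass of ImportError.
            continue
        if hasattr(module, name):
            return getattr(module, name)
    return None


# Assumption: candidate locations taken from the error entries above.
CANDIDATES = [
    "datahub_dagster_plugin.resources",
    "datahub_dagster_plugin",
]

resource_cls = resolve_symbol("DatahubDagsterResource", CANDIDATES)
if resource_cls is None:
    print("DatahubDagsterResource not found; check the installed plugin version")
```

Returning None instead of raising keeps the check usable in a startup diagnostic, where you may prefer to log the problem and continue.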
Warnings
gotcha The import path ('datahub_dagster_plugin') does not mirror the PyPI name ('acryl-datahub-dagster-plugin'): the 'acryl' prefix is dropped and hyphens become underscores. Many users mistakenly import from 'acryl_datahub_dagster_plugin'.
fix Use 'from datahub_dagster_plugin.hooks import DatahubDagsterHook' (or the equivalent import from '.resources').
breaking In version 1.0.0, the plugin was rewritten to use the new DataHub Python SDK (acryl-datahub). The old 'datahub-dagster-plugin' package is deprecated and has been removed. Users must migrate to 'acryl-datahub-dagster-plugin'.
fix Uninstall the old 'datahub-dagster-plugin' and install 'acryl-datahub-dagster-plugin'. Only the package name gains the 'acryl-' prefix; the module path is still 'datahub_dagster_plugin', so do not rewrite imports to 'acryl_datahub_dagster_plugin'. Check the documentation for any classes that moved between submodules.
deprecated The 'datahub-dagster-plugin' (without 'acryl-') is deprecated and no longer maintained. Users should switch to 'acryl-datahub-dagster-plugin'.
fix Use 'pip install acryl-datahub-dagster-plugin' and update imports accordingly.
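To see which of the two packages is present in an environment, the standard library's importlib.metadata can be queried by distribution name. This is only a sketch: the helper installed_versions is hypothetical, and the two distribution names are the ones given in the warnings above.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for dist in names:
        try:
            versions[dist] = metadata.version(dist)
        except metadata.PackageNotFoundError:
            versions[dist] = None
    return versions


CURRENT = "acryl-datahub-dagster-plugin"   # package name from this page
DEPRECATED = "datahub-dagster-plugin"      # deprecated name from this page

found = installed_versions([CURRENT, DEPRECATED])
if found[DEPRECATED] is not None:
    print(f"Deprecated {DEPRECATED} {found[DEPRECATED]} is installed; uninstall it")
if found[CURRENT] is None:
    print(f"{CURRENT} is not installed; run: pip install {CURRENT}")
```

Note that this checks distribution names (hyphenated, as used with pip), not import paths, so it complements rather than replaces an import check.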
Imports
- DatahubDagsterHook
  wrong: from acryl_datahub_dagster_plugin.hooks import DatahubDagsterHook
  correct: from datahub_dagster_plugin.hooks import DatahubDagsterHook
- DatahubDagsterResource
  wrong: from acryl_datahub_dagster_plugin.resources import DatahubDagsterResource
  correct: from datahub_dagster_plugin.resources import DatahubDagsterResource
- datahub_emitter
  from datahub.emitter.rest_emitter import DatahubRestEmitter
Quickstart
import os

from dagster import job, op, OpExecutionContext
from datahub_dagster_plugin.resources import DatahubDagsterResource
from datahub.emitter.rest_emitter import DatahubRestEmitter


@op(required_resource_keys={'datahub'})
def my_op(context: OpExecutionContext):
    context.log.info("Running op")
    return 1


@job(resource_defs={
    'datahub': DatahubDagsterResource(
        emitter=DatahubRestEmitter(gms_server=os.environ.get('DATAHUB_GMS_HOST', 'http://localhost:8080'))
    )
})
def my_job():
    my_op()


if __name__ == '__main__':
    result = my_job.execute_in_process()
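The quickstart reads the GMS address from DATAHUB_GMS_HOST and falls back to http://localhost:8080. A malformed value would otherwise only surface when the emitter first tries to connect, so one option is to validate it up front. The sketch below is an illustration only: gms_server_from_env and DEFAULT_GMS are hypothetical names, not part of the plugin or the DataHub SDK.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlparse

DEFAULT_GMS = "http://localhost:8080"  # same fallback as the quickstart


def gms_server_from_env(env: Optional[Mapping[str, str]] = None) -> str:
    """Read DATAHUB_GMS_HOST, apply the default, and reject non-http(s) values."""
    env = os.environ if env is None else env
    url = env.get("DATAHUB_GMS_HOST", DEFAULT_GMS).rstrip("/")
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"DATAHUB_GMS_HOST must be an http(s) URL, got {url!r}")
    return url
```

The returned string could then be passed as the gms_server argument shown in the quickstart. Accepting a mapping instead of always reading os.environ makes the function easy to exercise in tests.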