dlthub

0.22.1 verified Mon Apr 27 auth: no python

dlthub is a commercial extension to dlt (data load tool), a Python library for building data pipelines. The current version is 0.22.1, which requires Python >=3.9 and <3.15. New releases are frequent.

pip install dlthub
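After installing, it can help to confirm that the interpreter you are running falls within the supported range and can actually see the package. The following is a minimal sketch using only the standard library; the helper names python_supported and package_visible are illustrative and not part of dlt or dlthub.

```python
import importlib.util
import sys

# Supported range taken from the requirement stated above: >=3.9 and <3.15.
MIN_PY = (3, 9)
MAX_PY_EXCLUSIVE = (3, 15)


def python_supported(version=None):
    """Return True if (major, minor) falls within the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PY <= major_minor < MAX_PY_EXCLUSIVE


def package_visible(name):
    """Return True if the top-level package `name` is importable here."""
    return importlib.util.find_spec(name) is not None


# Report on the interpreter that is executing this script.
print("interpreter:", sys.executable)
print("python supported:", python_supported())
print("dlthub visible:", package_visible("dlthub"))
```

If "dlthub visible" prints False, the package was most likely installed into a different environment than the interpreter shown on the first line.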
error ModuleNotFoundError: No module named 'dlthub'
cause dlthub is not installed, or it is installed in a different Python environment from the one running your code.
fix
Run 'python -m pip install dlthub' with the same interpreter that runs your code.
error ImportError: cannot import name 'Pipeline' from 'dlthub'
cause Attempting to import Pipeline from dlthub instead of dlt.
fix
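Import the class from dlt instead: 'from dlt import Pipeline'. To create a pipeline, call the dlt.pipeline() factory rather than instantiating Pipeline directly.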
error TypeError: Pipeline.__init__() got an unexpected keyword argument 'restore_from_destination'
cause The installed dlt version is too old for the dlthub features in use.
fix
Upgrade both dlt and dlthub: 'pip install --upgrade dlt dlthub'.
error dlthub.exceptions.AuthenticationError: API key not provided
cause Trying to use a commercial feature that requires authentication.
fix
Set the DLT_HUB_API_KEY environment variable or provide it in configuration.
gotcha Importing dlthub directly may not be necessary for basic usage; it's a commercial extension for advanced features like monitoring, orchestration, and enterprise support. Most standard dlt operations do not require importing dlthub.
fix Only import dlthub when you need its commercial features.
deprecated Older versions of dlthub might have required different import paths (e.g., from dlthub import something).
fix Use the current pattern: install dlthub and run a plain 'import dlthub' to activate the extensions. No specific symbol imports are needed.
breaking Version 0.22.1 supports only Python >=3.9 and <3.15. Environments on Python 3.8 or earlier, or on 3.15 or later, may break.
fix Ensure Python version is >=3.9 and <3.15.
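Several of the entries above come down to version mismatches or a missing API key. The sketch below gathers that information in one place using only the standard library. The DLT_HUB_API_KEY variable name is taken from the entry above; the helper names installed_version and api_key_configured are illustrative and not part of dlt or dlthub.

```python
import os
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def api_key_configured(env=None):
    """Return True if DLT_HUB_API_KEY is set to a non-empty value."""
    env = os.environ if env is None else env
    return bool(env.get("DLT_HUB_API_KEY", "").strip())


# Print what is installed in the current environment.
for name in ("dlt", "dlthub"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
print("API key set:", api_key_configured())
```

This only checks the environment variable; a key provided through configuration, as the fix above also allows, would not be detected by this sketch.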

Basic pipeline using dlt and importing dlthub for commercial extensions.

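import dlt
import dlthub  # optional: only needed for the commercial extensions

# Create the pipeline with the dlt.pipeline() factory. The Pipeline class
# is not meant to be constructed directly with these keyword arguments.
# The duckdb destination requires the extra: pip install "dlt[duckdb]"
pipeline = dlt.pipeline(
    pipeline_name='quickstart',
    destination='duckdb',
    dataset_name='data'
)

# A resource is a generator that yields rows as dictionaries.
@dlt.resource
def my_resource():
    for i in range(10):
        yield {'id': i, 'value': f'row_{i}'}

# Extract, normalize, and load the rows, then print the load summary.
info = pipeline.run(my_resource())
print(info)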
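Because standard dlt operations do not require dlthub, code that must also run where dlthub is not installed can treat the import as optional. The following is a minimal sketch of that pattern; try_import is a hypothetical helper, not part of dlt or dlthub.

```python
import importlib


def try_import(name):
    """Import and return module `name`, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Attempt the import once and branch on the result.
dlthub = try_import("dlthub")
if dlthub is None:
    print("dlthub not installed; continuing with plain dlt")
else:
    print("dlthub imported")
```

With this guard in place, the rest of the pipeline code stays the same whether or not the commercial package is present.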