dagstermill
0.29.2 verified Mon Apr 27 auth: no python
dagstermill integrates Jupyter notebooks into Dagster, letting a notebook run as an op (formerly a solid) with input and output dependencies. The current version is 0.29.2 and requires Python 3.10-3.14. Releases follow the dagster core cadence (monthly).
pip install dagstermill
Common errors
error ModuleNotFoundError: No module named 'dagstermill'
cause dagstermill not installed or installed in wrong environment.
fix
Run 'pip install dagstermill' in the same Python environment where Dagster runs.
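A quick way to confirm which interpreter is active and whether it can see the package is sketched below. It uses only the standard library; the helper name `is_installed` is illustrative and not part of dagstermill.

```python
import importlib.util
import sys


def is_installed(module_name: str) -> bool:
    """Return True if the current interpreter can locate the module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # sys.executable is the interpreter that pip has to target.
    print("interpreter:", sys.executable)
    print("dagstermill importable:", is_installed("dagstermill"))
```

If the second line prints False, running 'python -m pip install dagstermill' with the printed interpreter installs the package into that environment.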
error KeyError: 'DAGSTER_HOME'
cause Dagster requires DAGSTER_HOME environment variable set for persistent run storage.
fix
Set DAGSTER_HOME (e.g., export DAGSTER_HOME=/path/to/dagster_home) or run without persistent storage.
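For scripts and tests, the variable can also be set from Python before Dagster loads its instance. The sketch below is one possible approach; `ensure_dagster_home` is a hypothetical helper, and the temporary-directory fallback is an assumption that suits throwaway local runs only.

```python
import os
import tempfile
from pathlib import Path


def ensure_dagster_home(path=None) -> str:
    """Point DAGSTER_HOME at an existing absolute directory and return it."""
    # Assumption: with no path given, a throwaway temp directory is acceptable.
    home = Path(path) if path else Path(tempfile.mkdtemp(prefix="dagster_home_"))
    home = home.resolve()  # Dagster expects an absolute path
    home.mkdir(parents=True, exist_ok=True)
    os.environ["DAGSTER_HOME"] = str(home)
    return str(home)
```

Call it at the top of the entry-point script, before any job is executed, so that run storage resolves to the chosen directory.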
error papermill.exceptions.PapermillMissingParameterException: Notebook does not have a cell with tag 'parameters'
cause Input notebook lacks a tagged parameters cell.
fix
Add a cell with parameter defaults and tag it as 'parameters' in Jupyter Notebook.
error TypeError: define_dagstermill_solid() got an unexpected keyword argument 'output_notebook_name'
cause Using an older version of dagstermill (pre-0.14), whose function signature does not accept this argument.
fix
Upgrade dagstermill with 'pip install --upgrade dagstermill', or switch to define_dagstermill_op.
Warnings
breaking In dagster 1.0+, solids are renamed to ops. Use define_dagstermill_op instead of define_dagstermill_solid.
fix Replace define_dagstermill_solid with define_dagstermill_op and use @op/@job decorators.
gotcha Notebook must have a 'parameters' cell (tagged) to accept inputs. Without it, inputs are silently ignored.
fix In Jupyter, add a cell with default values and tag it as 'parameters' (use Cell Toolbar > Tags).
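The tag can also be added without opening Jupyter, since an .ipynb file is JSON and tags live in each cell's metadata. The following is a minimal sketch assuming the nbformat 4 layout; both function names are hypothetical helpers, not dagstermill or papermill APIs.

```python
import json


def tag_parameters_cell(notebook: dict, cell_index: int = 0) -> dict:
    """Add the 'parameters' tag to one cell of a notebook dict (nbformat 4 layout)."""
    cell = notebook["cells"][cell_index]
    tags = cell.setdefault("metadata", {}).setdefault("tags", [])
    if "parameters" not in tags:
        tags.append("parameters")
    return notebook


def tag_notebook_file(path: str, cell_index: int = 0) -> None:
    """Load an .ipynb file, tag the chosen cell, and write it back in place."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    tag_parameters_cell(notebook, cell_index)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1)
```

Point `cell_index` at the cell that holds the default values; tagging is idempotent, so running the helper twice leaves a single 'parameters' tag.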
deprecated Managed notebook execution (Engine/Resource) is deprecated in favor of a simple context.
fix Remove execution_engine arguments; use default execution.
gotcha Output notebooks are stored in the run's output directory, not necessarily local. Use io_manager to persist.
fix Configure a filesystem or S3 io_manager to capture output notebooks.
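deprecated The 'output_notebook' materialization is deprecated; expose the executed notebook as an op output instead.
fix Set the 'output_notebook_name' parameter in define_dagstermill_op and consume the executed notebook as an op output. This requires an 'output_notebook_io_manager' resource on the job.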
Install
pip install dagstermill[pandas]
Imports
- define_dagstermill_solid
wrong: from dagstermill.core import define_dagstermill_solid
correct: from dagstermill import define_dagstermill_solid
- op
wrong: from dagstermill import solid
correct: from dagster import op
Quickstart
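from dagster import job
from dagstermill import define_dagstermill_op, local_output_notebook_io_manager

# Wrap the notebook as an op. output_notebook_name names the extra op output
# that carries the executed notebook, so it must be a valid Dagster name
# (letters, digits, and underscores only).
my_notebook_op = define_dagstermill_op(
    name='my_notebook_op',
    notebook_path='notebooks/my_notebook.ipynb',
    output_notebook_name='output_notebook',
)

# Setting output_notebook_name requires an 'output_notebook_io_manager' resource.
@job(resource_defs={'output_notebook_io_manager': local_output_notebook_io_manager})
def my_job():
    my_notebook_op()

if __name__ == '__main__':
    result = my_job.execute_in_process()
    print(result.success)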