Nutter: Databricks Notebook Testing Library

0.1.35 · active · verified Mon Apr 13

Nutter is a Python library developed by Microsoft for unit and integration testing of Databricks notebooks. It lets data and machine learning engineers define test fixtures directly inside notebooks. Nutter has two main components: the Nutter Runner (server-side, installed on Databricks clusters) and the Nutter CLI (client-side, for local or CI/CD execution). The CLI can be invoked from CI/CD pipelines such as Azure DevOps to automate testing of Databricks-based data pipelines. The current version is 0.1.35; releases address enhancements and bug fixes.

Quickstart

This quickstart shows how to create a Nutter test fixture inside a Databricks notebook. It defines a class that inherits from `NutterFixture`, implements the `before_all` and `after_all` hooks, and adds one `assertion_`-prefixed method per test case. The tests run when `execute_tests()` is called on an instance of the fixture.

# Save this as a Databricks notebook, e.g., 'test_my_notebook'

%pip install nutter

from runtime.nutterfixture import NutterFixture
import os

class MyNotebookTestFixture(NutterFixture):
    def __init__(self):
        super().__init__()
        # Initialize any test-specific variables or parameters
        self.expected_value = 42

    def before_all(self):
        # This method runs once before all assertion methods.
        # Typically, you would run the notebook under test here.
        # For simplicity, we'll simulate a result.
        # Example: dbutils.notebook.run('path/to/notebook_under_test', 600, {'param1': 'value1'})
        self.actual_result = self.expected_value # Simulate successful notebook execution

    def assertion_check_result_matches_expected(self):
        # Nutter discovers methods prefixed with 'assertion_' as test cases.
        assert self.actual_result == self.expected_value, "The result should match the expected value"

    def assertion_ensure_truthy_condition(self):
        # Another example test case
        assert True, "This condition should always be true"

    def after_all(self):
        # This method runs once after all assertion methods have completed.
        # Use it for cleanup, if necessary.
        print("All tests completed for MyNotebookTestFixture.")

# Instantiate and execute the test fixture
result = MyNotebookTestFixture().execute_tests()
print(result.to_string())

# Optional: signal failure when the notebook runs as a Databricks job,
# so that CI/CD pipelines report failed tests correctly.
# Nutter's documented way to hand results back to the caller (for example the Nutter CLI)
# is result.exit(dbutils); comment that call out to see the printed report in the notebook.
# The environment variable 'DATABRICKS_IS_JOB_RUN' is illustrative: set it yourself in the job
# configuration, or detect a job run from
# dbutils.notebook.entry_point.getDbutils().notebook().getContext().currentRunId().isDefined()
# `is_success` is used as given in this example; check it against your installed Nutter version.
if os.environ.get('DATABRICKS_IS_JOB_RUN', 'False').lower() == 'true' and not result.is_success:
    print("Tests failed in job context. Exiting with failure code.")
    # Example for actual job exit (requires dbutils):
    # dbutils.notebook.exit("Tests failed")
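
To make the naming convention in the quickstart concrete, here is a minimal, self-contained sketch of how a fixture can discover test cases by the `assertion_` prefix and run optional hooks that share the same test name (`before_<name>`, `run_<name>`, `after_<name>`). `MiniFixture` and `DemoFixture` are hypothetical names invented for illustration. This is a simplified stand-in that runs without Databricks or Nutter installed; it is not Nutter's implementation and it skips details such as result serialization.

```python
class MiniFixture:
    """Simplified stand-in for a prefix-based test fixture (not Nutter's code)."""

    PREFIX = "assertion_"

    def before_all(self):
        pass

    def after_all(self):
        pass

    def execute_tests(self):
        # Collect test names from callable attributes that start with the prefix.
        names = sorted(
            attr[len(self.PREFIX):]
            for attr in dir(self)
            if attr.startswith(self.PREFIX) and callable(getattr(self, attr))
        )
        results = {}
        self.before_all()
        for name in names:
            try:
                # Optional per-test hooks are looked up by the shared test name.
                for hook in ("before_", "run_", self.PREFIX, "after_"):
                    method = getattr(self, hook + name, None)
                    if method is not None:
                        method()
                results[name] = "PASSED"
            except AssertionError as exc:
                # A failed assert marks only this test case as failed.
                results[name] = f"FAILED: {exc}"
        self.after_all()
        return results


class DemoFixture(MiniFixture):
    def run_total(self):
        # Stands in for running the notebook under test.
        self.total = sum([1, 2, 3])

    def assertion_total(self):
        assert self.total == 6, "total should be 6"

    def assertion_broken(self):
        assert 1 + 1 == 3, "arithmetic is off"


demo_results = DemoFixture().execute_tests()
for test_name, outcome in demo_results.items():
    print(f"{test_name}: {outcome}")
```

In this sketch a failing assertion is recorded and the remaining test cases still run, which is the behavior a CI pipeline typically needs in order to report every failure in one pass.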
