Databricks Labs Blueprint

Version 0.12.0 · verified Tue May 12 · auth: no · python install: verified

Databricks Labs Blueprint is a Python library that provides common building blocks and utilities for Databricks Labs projects. It offers a Python-native, pathlib-like interface for Databricks Workspace paths, tools for building simple terminal user interfaces (TUIs), and utilities for managing application and installation state. The current version is 0.12.0, with a regular release cadence (typically monthly or bi-monthly) reflecting active development and maintenance. [1, 3]

pip install databricks-labs-blueprint
error ModuleNotFoundError: No module named 'blueprint'
cause Either the package is not installed in the current Python environment, or the code imports the top-level name 'blueprint'. The package installs under the 'databricks.labs.blueprint' namespace; there is no top-level 'blueprint' module.
fix Install the package (pip install databricks-labs-blueprint) and import from 'databricks.labs.blueprint' rather than 'blueprint'.
error ImportError: cannot import name 'WorkspacePath' from 'blueprint'
cause 'WorkspacePath' lives in the 'databricks.labs.blueprint.paths' module, not in a top-level 'blueprint' package.
fix Import it from the correct module: from databricks.labs.blueprint.paths import WorkspacePath
error AttributeError: 'WorkspacePath' object has no attribute 'touch'
cause 'WorkspacePath' provides a pathlib-like interface but does not implement every method of 'pathlib.Path', such as 'touch()'.
fix Create the empty file another way, e.g. workspace_path.write_text('').
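The workaround can be illustrated locally with plain `pathlib` (a stand-in here; on Databricks you would call the same method on a `WorkspacePath` instance):

```python
from pathlib import Path
import tempfile

# write_text("") creates a zero-byte file, equivalent to touch()
empty = Path(tempfile.mkdtemp()) / "placeholder.txt"
empty.write_text("")
print(empty.exists())     # True
print(len(empty.read_text()))  # 0
```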
error RuntimeError: dbutils is not available.
cause Databricks-specific functionality such as 'WorkspacePath' relies on the 'dbutils' object, which only exists inside a Databricks notebook or job environment. The error is raised when such code runs outside Databricks (e.g. locally).
fix Run the code within a Databricks notebook or job environment, or mock 'dbutils' for local testing if appropriate.
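A minimal local-testing sketch of the mocking approach, using `unittest.mock` from the standard library. `fake_dbutils` and `count_files` are illustrative stand-ins, not library API:

```python
from unittest.mock import MagicMock

# Shim that stands in for the Databricks-provided `dbutils` object,
# so code paths that call it can run outside a workspace.
fake_dbutils = MagicMock()
fake_dbutils.fs.ls.return_value = []   # pretend the directory is empty

def count_files(dbutils, path: str) -> int:
    """Example function under test that depends on dbutils."""
    return len(dbutils.fs.ls(path))

print(count_files(fake_dbutils, "/tmp/example"))  # 0
```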
breaking In `v0.11.3`, the unmarshalling of JSON floating-point values was fixed. Previously, JSON floats might have been silently truncated to integers. The updated functionality now raises a `SerdeError` when precision would be lost during conversion from float to integer. [release notes]
fix Review code that reads JSON configuration files, especially if it expects integer values from floating-point inputs. Handle `SerdeError` for explicit type conversions.
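The new behavior can be sketched in pure Python; `SerdeError` and `as_int` below are illustrative stand-ins, not the library's exact API:

```python
class SerdeError(ValueError):
    """Stand-in for the library's serde error type."""

def as_int(value: float) -> int:
    # Raise instead of silently truncating a JSON float
    # when an integer is expected (v0.11.3 behavior).
    if value != int(value):
        raise SerdeError(f"cannot convert {value} to int without losing precision")
    return int(value)

print(as_int(3.0))  # 3
try:
    as_int(3.5)     # precision would be lost
except SerdeError as err:
    print(err)
```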
deprecated Starting with `v0.11.0`, using `Any` and `object` as type annotations on data classes for marshalling is deprecated and will issue a `DeprecationWarning`. [release notes]
fix Refactor data classes to use more specific type annotations instead of `Any` or `object` to avoid deprecation warnings and ensure type safety.
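A before/after sketch of the refactor, using plain dataclasses (the field names are hypothetical):

```python
from dataclasses import dataclass
from typing import Any

# Deprecated style: `Any` hides the real shape of the field
# from the marshaller and defeats type checking.
@dataclass
class LooseConfig:
    threshold: Any

# Preferred style: concrete annotations let values be validated on load.
@dataclass
class StrictConfig:
    threshold: float
    name: str = "default"

print(StrictConfig(threshold=0.75))
```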
gotcha In versions prior to `v0.9.3`, there was an issue where `databricks-sdk` config objects could be unintentionally overridden when creating installation config files. This could lead to unexpected behavior or incorrect workspace configurations. [release notes]
fix Upgrade to `v0.9.3` or newer. If upgrading is not immediately possible, carefully review any custom logic that modifies or saves Databricks SDK configuration objects to ensure they are handled correctly.
gotcha The `databricks-labs-blueprint` library requires Python 3.10 or newer. Installing it on Python versions older than 3.10 will result in an `ERROR: No matching distribution found`.
fix Ensure your Python environment is version 3.10 or higher before installing `databricks-labs-blueprint`. Upgrade your Python interpreter or use a compatible virtual environment.
Install benchmarks (wheel build time, install time, installed size) by Python version and base image:

| Python | OS / libc | Status | Wheel build | Install | Disk |
|--------|---------------|-------------|-------------|---------|-------|
| 3.10 | alpine (musl) | wheel | - | 3.62s | 55.0M |
| 3.10 | alpine (musl) | - | - | 3.76s | 53.3M |
| 3.10 | slim (glibc) | wheel | 4.8s | 2.51s | 56M |
| 3.10 | slim (glibc) | - | - | 2.42s | 54M |
| 3.11 | alpine (musl) | wheel | - | 6.44s | 60.9M |
| 3.11 | alpine (musl) | - | - | 6.81s | 59.0M |
| 3.11 | slim (glibc) | wheel | 4.3s | 5.21s | 62M |
| 3.11 | slim (glibc) | - | - | 4.84s | 60M |
| 3.12 | alpine (musl) | wheel | - | 5.38s | 52.2M |
| 3.12 | alpine (musl) | - | - | 5.88s | 50.4M |
| 3.12 | slim (glibc) | wheel | 3.9s | 5.46s | 53M |
| 3.12 | slim (glibc) | - | - | 5.57s | 51M |
| 3.13 | alpine (musl) | wheel | - | 5.26s | 52.0M |
| 3.13 | alpine (musl) | - | - | 5.29s | 50.1M |
| 3.13 | slim (glibc) | wheel | 4.0s | 4.83s | 53M |
| 3.13 | slim (glibc) | - | - | 4.98s | 51M |
| 3.9 | alpine (musl) | build_error | - | - | - |
| 3.9 | alpine (musl) | - | - | - | - |
| 3.9 | slim (glibc) | build_error | - | 1.6s | - |
| 3.9 | slim (glibc) | - | - | - | - |

This quickstart demonstrates how to initialize a `WorkspaceClient` and use `WorkspacePath` to create and manage directories within your Databricks workspace. It expands a relative path to the user's home directory, creates the specified folder structure, verifies its existence, and then cleans it up. Ensure your Databricks SDK authentication (environment variables or CLI profile) is configured before running. [1]

import os
from databricks.sdk import WorkspaceClient
from databricks.labs.blueprint.paths import WorkspacePath

# Ensure DATABRICKS_HOST and DATABRICKS_TOKEN environment variables are set,
# or a Databricks CLI profile is configured.
# For local testing, you might run:
# export DATABRICKS_HOST='https://<your-databricks-instance>.cloud.databricks.com'
# export DATABRICKS_TOKEN='dapi...'

# WorkspaceClient() resolves credentials from environment variables or a
# configured Databricks CLI profile; warn if neither env variable is set.
if not os.environ.get('DATABRICKS_HOST') or not os.environ.get('DATABRICKS_TOKEN'):
    print("DATABRICKS_HOST/DATABRICKS_TOKEN not set; "
          "falling back to a configured Databricks CLI profile.")

ws = WorkspaceClient()

print(f"Initialized WorkspaceClient for host: {ws.config.host}")

try:
    user_name = ws.current_user.me().user_name
    print(f"Current user: {user_name}")

    # Example: Working with user home folders
    folder_name = 'blueprint-test-folder'
    wsp = WorkspacePath(ws, f"~/{folder_name}/sub/dir")

    # Expand the user path and create the directory
    with_user = wsp.expanduser()
    print(f"Expanded path: {with_user}")
    with_user.mkdir()
    print(f"Directory '{with_user}' created.")

    # Verify existence
    assert with_user.is_dir()
    print(f"Directory '{with_user}' exists.")

    # Clean up (recursive rmdir)
    with_user.parent.parent.rmdir(recursive=True)
    assert not with_user.parent.parent.exists()
    print(f"Directory '{with_user.parent.parent}' and its contents removed.")

except Exception as e:
    print(f"An error occurred: {e}")
    print("Please ensure your Databricks environment is correctly configured (host, token, permissions).")