Pulumi Databricks Provider
The `pulumi-databricks` Python package is an Infrastructure as Code (IaC) tool that enables developers to define, deploy, and manage Databricks cloud resources programmatically. It wraps the Databricks Terraform provider, providing Pythonic access to resources such as notebooks, clusters, jobs, and Unity Catalog entities. The library is actively maintained with frequent updates, typically released multiple times per month, reflecting the rapid development of its upstream Terraform provider. The current version is 1.90.0.
Common errors
- Error: Cannot access cluster ####-######-####### that was terminated or unpinned more than 30 days ago
  Cause: Pulumi is trying to manage a cluster whose metadata is no longer available in the Databricks API, typically because the cluster was terminated or unpinned more than 30 days ago. The Pulumi state still references this old cluster.
  Fix: Upgrade `pulumi-databricks` to v0.5.5 or later if possible. Otherwise, manually remove the cluster from the Pulumi state with `pulumi state rm urn:pulumi:<stack>::<project>::databricks:index/cluster:Cluster::<resource_name>`, then import the current state of the desired cluster or create a new cluster resource.
- Error: configuration for 'databricks:host' is required
  Cause: The Databricks provider needs the host URL of your Databricks workspace to authenticate and interact with it, and this configuration is missing.
  Fix: Set the `DATABRICKS_HOST` environment variable (e.g., `export DATABRICKS_HOST="https://adb-YOUR_WORKSPACE_ID.1.azuredatabricks.net"`) or configure it via Pulumi config (`pulumi config set databricks:host "https://..."`).
- Error: configuration for 'databricks:token' is required
  Cause: The provider requires an authentication token (such as a Personal Access Token) to authorize requests to your Databricks workspace.
  Fix: Set the `DATABRICKS_TOKEN` environment variable (e.g., `export DATABRICKS_TOKEN="dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"`) or configure it securely via Pulumi config (`pulumi config set databricks:token YYYYYYYYYYYYYY --secret`).
- Error: more than one authorization method configured
  Cause: Multiple authentication-related environment variables or Pulumi configuration settings are present and conflict (e.g., both `DATABRICKS_HOST` and `DATABRICKS_CONFIG_FILE` are set).
  Fix: Use only one set of authentication credentials. If using `~/.databrickscfg`, set only `DATABRICKS_CONFIG_FILE` (or `databricks:configFile`) and `databricks:profile`. If using host/token, set only `DATABRICKS_HOST`/`DATABRICKS_TOKEN` or `databricks:host`/`databricks:token`.
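For token-based authentication, the host and token from the fixes above can be set once per stack via Pulumi config (a sketch; substitute your own workspace URL and token):

```shell
# Workspace URL (stored in plaintext in the stack config file)
pulumi config set databricks:host "https://adb-YOUR_WORKSPACE_ID.1.azuredatabricks.net"

# Personal Access Token; --secret encrypts the value in the stack config
pulumi config set databricks:token dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX --secret
```

Using only one such pair of settings also avoids the "more than one authorization method configured" error above.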
Warnings
- breaking Major version upgrades of the underlying `terraform-provider-databricks` (often monthly) can introduce breaking changes in resource properties, behavior, or required arguments, even if `pulumi-databricks` itself remains at v1.x. Always review the `terraform-provider-databricks` changelog when upgrading `pulumi-databricks`.
- gotcha Using `content_base64` for `databricks.Notebook` is discouraged for large notebooks as it increases the Pulumi state file size and memory footprint. Consider using the `source` argument instead to specify a local file path for the notebook content.
- gotcha When configuring authentication, ensure `databricks:token` is marked as a secret if using `pulumi config set` to prevent it from being stored in plaintext in the Pulumi state file.
- gotcha The `accountId` provider argument should *not* be set for workspace-level providers. Setting it incorrectly can lead to "invalid Databricks Account configuration" errors when performing workspace-specific operations.
- deprecated The `basic` authentication type for the provider, specified via `databricks:authType`, is deprecated.
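The `content_base64` gotcha above also implies that any inline notebook content must actually be base64-encoded before being passed to the resource. A minimal stdlib sketch (`to_content_base64` is an illustrative helper, not part of the provider API):

```python
import base64


def to_content_base64(notebook_source: str) -> str:
    """Base64-encode notebook source text for the content_base64 argument."""
    return base64.b64encode(notebook_source.encode("utf-8")).decode("utf-8")


# Encode a one-line notebook body
encoded = to_content_base64("print('Hello from Pulumi!')\n")
```

For anything beyond a short snippet, prefer the `source` argument with a local file path so the notebook body stays out of the Pulumi state.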
Install
- pip install pulumi-databricks
Imports
- Provider
from pulumi_databricks.provider import Provider
import pulumi_databricks as databricks
- Notebook
from pulumi_databricks import Notebook
- Cluster
from pulumi_databricks import Cluster
Quickstart
import base64

import pulumi
import pulumi_databricks as databricks

# Configure Databricks authentication using environment variables, e.g.:
#   export DATABRICKS_HOST="https://adb-YOUR_WORKSPACE_ID.1.azuredatabricks.net"
#   export DATABRICKS_TOKEN="dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"

# Retrieve current user information (useful for creating resources in the user's home directory)
current_user = databricks.get_current_user()

notebook_source = "# Databricks notebook created by Pulumi\nprint('Hello from Pulumi!')\n"

# Create a Databricks Notebook
hello_notebook = databricks.Notebook(
    "my-first-notebook",
    path=f"{current_user.home}/pulumi_hello_world_notebook",
    language="PYTHON",
    # content_base64 must be base64-encoded. For large notebooks this is discouraged;
    # prefer the 'source' argument pointing at a local notebook file instead.
    content_base64=base64.b64encode(notebook_source.encode("utf-8")).decode("utf-8"),
)

pulumi.export("notebook_url", hello_notebook.url)