Databricks PyPI Extras

0.1 · active · verified Thu Apr 16

`databricks-pypi-extras` is a Python library from Databricks that collects utilities for working in Databricks environments and for making existing PyPI libraries more compatible with them. It currently includes modules such as `databricks.connect_extras`, which simplifies connecting to Databricks (for example, via Databricks Connect v2) and interacting with notebook contexts. The current version is 0.1; new releases are expected as further utilities are added.

Common errors

Warnings

Install

Imports

Quickstart

This quickstart uses `current_spark_context` from `databricks.connect_extras` to obtain a SparkSession. The utility is intended for two settings: a Databricks notebook, and a local environment that connects through Databricks Connect v2. In the second case, `databricks-connect` must be installed and configured first.

from databricks.connect_extras.context import current_spark_context

# This example retrieves a Spark session using databricks.connect_extras.
# To run it successfully outside a Databricks notebook, you must:
# 1. Install Databricks Connect: pip install databricks-connect
# 2. Configure Databricks Connect with a Databricks configuration profile
#    or by setting environment variables (DATABRICKS_HOST, DATABRICKS_TOKEN, DATABRICKS_CLUSTER_ID, etc.).

print("Attempting to get Spark session via databricks.connect_extras...")
try:
    spark = current_spark_context()
    if spark:
        print(f"Successfully retrieved SparkSession (Spark version: {spark.version})")
        # Example usage: create a simple DataFrame
        data = [("Alice", 1), ("Bob", 2), ("Charlie", 3)]
        df = spark.createDataFrame(data, ["Name", "Value"])
        print("\nExample DataFrame created and shown:")
        df.show()
    else:
        print("Spark session could not be retrieved. Ensure Databricks Connect is properly configured or run in a Databricks Notebook.")
except Exception as e:
    print(f"An error occurred: {e}")
    print("Please ensure `databricks-connect` is installed and configured, and your environment is set up for Databricks Connect v2.")
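
Because the quickstart depends on environment-based configuration when run outside a notebook, a small pre-flight check can make a missing setting easier to spot than a generic connection error. The following is a minimal sketch, not part of `databricks-pypi-extras`: the helper name `missing_connect_settings` is hypothetical, and treating all three variables as required is an assumption, since a configuration profile may supply the same values instead.

```python
import os
from typing import Mapping

# Environment variables named in the quickstart comments.
# Assumption: all three are needed when no configuration profile is used.
REQUIRED_VARS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN", "DATABRICKS_CLUSTER_ID")


def missing_connect_settings(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or blank in `env`.

    Hypothetical helper for illustration; it only inspects the mapping
    and never contacts Databricks.
    """
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_connect_settings()
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All Databricks Connect environment variables are set.")
```

Running this check before calling `current_spark_context()` separates configuration problems from genuine connection failures. Passing a plain dictionary instead of `os.environ` keeps the helper easy to test.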
