{"id":14521,"library":"databricks-pypi-extras","title":"Databricks PyPI Extras","description":"`databricks-pypi-extras` is a Python library developed by Databricks, providing a collection of utilities designed to enhance the Databricks user experience and extend existing PyPI libraries for better compatibility with Databricks environments. It currently includes modules like `databricks.connect_extras` to simplify operations such as connecting to Databricks (e.g., via Databricks Connect v2) and interacting with notebook contexts. The current version is 0.1, with further releases expected as new utilities are developed and integrated.","status":"active","version":"0.1","language":"en","source_language":"en","source_url":"https://github.com/databricks/databricks-pypi-extras","tags":["Databricks","Spark","Databricks Connect","Utilities","Productivity"],"install":[{"cmd":"pip install databricks-pypi-extras","lang":"bash","label":"Install core library"}],"dependencies":[{"reason":"Required for utilities in the `databricks.connect_extras` module to function, particularly for Databricks Connect v2 features.","package":"databricks-connect","optional":true}],"imports":[{"note":"A utility to get the active SparkSession, typically from Databricks Connect or a Databricks notebook environment.","symbol":"current_spark_context","correct":"from databricks.connect_extras.context import current_spark_context"},{"note":"A utility to retrieve the path of the current Databricks notebook.","symbol":"get_current_notebook_path","correct":"from databricks.connect_extras.notebook import get_current_notebook_path"}],"quickstart":{"code":"from databricks.connect_extras.context import current_spark_context\n\n# This example demonstrates retrieving a Spark session using databricks.connect_extras.\n# To run this successfully outside a Databricks Notebook, you must:\n# 1. Install Databricks Connect: pip install \"databricks-connect[databricks-connect-dependencies]\"\n# 2. Configure Databricks Connect using `databricks-connect configure`\n#    or by setting environment variables (DATABRICKS_HOST, DATABRICKS_TOKEN, DATABRICKS_CLUSTER_ID, etc.).\n\nprint(\"Attempting to get Spark session via databricks.connect_extras...\")\ntry:\n    spark = current_spark_context()\n    if spark:\n        print(f\"Successfully retrieved SparkSession (Spark version: {spark.version})\")\n        # Example usage: create a simple DataFrame\n        data = [(\"Alice\", 1), (\"Bob\", 2), (\"Charlie\", 3)]\n        df = spark.createDataFrame(data, [\"Name\", \"Value\"])\n        print(\"\\nExample DataFrame created and shown:\")\n        df.show()\n    else:\n        print(\"Spark session could not be retrieved. Ensure Databricks Connect is properly configured or run in a Databricks Notebook.\")\nexcept Exception as e:\n    print(f\"An error occurred: {e}\")\n    print(\"Please ensure `databricks-connect` is installed and configured, and your environment is set up for Databricks Connect v2.\")\n","lang":"python","description":"This quickstart demonstrates how to use `current_spark_context` from `databricks.connect_extras` to obtain a SparkSession. This utility is particularly useful when working with Databricks Connect v2 or directly within a Databricks notebook environment. Outside these contexts, it requires `databricks-connect` to be installed and configured."},"warnings":[{"fix":"Ensure your code is running within a Databricks Notebook or that your local environment is correctly configured with `databricks-connect` if using `connect_extras` features.","message":"Many utilities in `databricks-pypi-extras` (especially in `connect_extras`) are designed for specific Databricks environments (e.g., Databricks Notebooks or Databricks Connect v2). Running them outside these environments might lead to `None` returns, errors, or unexpected behavior.","severity":"gotcha","affected_versions":"0.1"},{"fix":"Pin your dependency to `databricks-pypi-extras==0.1` and monitor the GitHub repository's `main` branch or future releases for API updates before upgrading to newer versions. Be prepared to adapt your code when upgrading.","message":"As the library is currently at version 0.1, its API is considered experimental and highly subject to change. Future versions may introduce breaking changes to existing modules, functions, or their signatures without prior deprecation cycles.","severity":"breaking","affected_versions":"0.1"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Ensure you have installed the library using `pip install databricks-pypi-extras`. Verify the import path against the official GitHub repository's `src` directory, as the library is modular and may introduce new sub-packages.","cause":"The `databricks-pypi-extras` library or a specific submodule was not installed, or the import path is incorrect.","error":"ModuleNotFoundError: No module named 'databricks.connect_extras.context'"},{"fix":"Install `databricks-connect` (`pip install \"databricks-connect[databricks-connect-dependencies]\"`) and configure it using `databricks-connect configure`, or run your code directly within a Databricks notebook.","cause":"Attempting to use `current_spark_context()` or similar utilities from `databricks.connect_extras` outside of a properly configured Databricks Connect environment or a Databricks Notebook.","error":"Spark session could not be retrieved. Ensure Databricks Connect is properly configured or run in a Databricks Notebook."}],"ecosystem":"pypi"}