Unity Catalog AI Python Library
The `unitycatalog-ai` library is the official Python client for integrating with Unity Catalog's AI capabilities, particularly for managing and executing AI functions across various Generative AI tools. It simplifies the creation, registration, and use of Python functions as tools for AI agents, promoting unified governance and access control for data and AI assets. The current version is 0.3.2, with active development and frequent releases.
Warnings
- breaking The Unity Catalog APIs, and by extension `unitycatalog-ai`, are still evolving and should not be assumed stable. Future versions may introduce breaking changes without a major version bump and require code adjustments.
- gotcha Python functions registered with Unity Catalog have specific requirements: all arguments and the return value must have type hints, and the docstring must follow Google-style guidelines (with descriptions for the function, its arguments, and its return value). Additionally, any import of a non-core Python library *must* appear inside the function body itself, not at module level.
- gotcha From version 0.3.0 onward, Unity Catalog functions run in a 'sandbox' execution mode by default. This mode isolates code execution for security, but it imposes CPU and memory limits and blocks access to certain system modules (e.g., `sys`, `subprocess`, `ctypes`, `socket`, `importlib`, `pickle`, `marshal`, `shutil`).
- gotcha The `unitycatalog-client` SDK, which `unitycatalog-ai` builds upon, is asynchronous (aiohttp-based). While `unitycatalog-ai` provides synchronous wrappers, direct asynchronous calls require careful management of the event loop. In environments like Jupyter Notebooks, attempting to create additional event loops for async method calls can lead to errors.
- gotcha The `unitycatalog-ai` library requires a running Unity Catalog server (either a local open-source instance or a Databricks-managed service) to function. The Python library itself does not provide the server.
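The function requirements and sandbox restrictions listed above can be checked locally before a function is registered. The following is a minimal sketch using only the standard library. The helper names `check_signature_and_docstring` and `find_blocked_imports` are hypothetical and not part of `unitycatalog-ai`, the docstring check is a rough heuristic rather than a full Google-style parser, and the blocked-module set is copied from the sandbox warning (which gives examples, so the real list may be longer).

```python
import ast
import inspect
from typing import Callable, List

# Assumption: taken from the sandbox warning above; the server's actual list may differ.
BLOCKED_MODULES = {
    "sys", "subprocess", "ctypes", "socket",
    "importlib", "pickle", "marshal", "shutil",
}


def check_signature_and_docstring(func: Callable) -> List[str]:
    """Return a list of problems with type hints and docstring sections."""
    problems = []
    signature = inspect.signature(func)

    # Every argument and the return value need a type hint.
    for name, param in signature.parameters.items():
        if param.annotation is inspect.Parameter.empty:
            problems.append(f"argument '{name}' has no type hint")
    if signature.return_annotation is inspect.Signature.empty:
        problems.append("return value has no type hint")

    # Heuristic only: look for the Google-style section headers.
    doc = inspect.getdoc(func) or ""
    if not doc:
        problems.append("missing docstring")
    else:
        if signature.parameters and "Args:" not in doc:
            problems.append("docstring has no 'Args:' section")
        if "Returns:" not in doc:
            problems.append("docstring has no 'Returns:' section")
    return problems


def find_blocked_imports(source: str) -> List[str]:
    """Return the blocked top-level modules imported anywhere in the source text."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or ""]
        else:
            continue
        for module in modules:
            root = module.split(".")[0]
            if root in BLOCKED_MODULES:
                found.append(root)
    return found
```

An empty list from both helpers means none of these particular issues were detected. It does not guarantee that registration or sandboxed execution will succeed.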
Install
- pip install unitycatalog-ai
- pip install unitycatalog-client unitycatalog-ai
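After installing, it can help to confirm which release is present, since the sandbox default described in the warnings applies from 0.3.0 onward. This is a small sketch using the standard library's `importlib.metadata`. The helper names are hypothetical, and the version parsing is deliberately simple (numeric dot-separated components only), not a full PEP 440 parser.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def parse_release(version_string: str) -> Tuple[int, ...]:
    """Turn '0.3.2' into (0, 3, 2), stopping at the first non-numeric component."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def supports_sandbox_default(package: str = "unitycatalog-ai") -> bool:
    """True if the installed release is at least 0.3.0."""
    found = installed_version(package)
    return found is not None and parse_release(found) >= (0, 3, 0)


if __name__ == "__main__":
    print("unitycatalog-ai:", installed_version("unitycatalog-ai"))
    print("sandbox default expected:", supports_sandbox_default())
```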
Imports
- UnitycatalogFunctionClient
from unitycatalog.ai.core.client import UnitycatalogFunctionClient
- ApiClient
from unitycatalog.client import ApiClient
- Configuration
from unitycatalog.client import Configuration
Quickstart
import os
from unitycatalog.ai.core.client import UnitycatalogFunctionClient
from unitycatalog.client import ApiClient, Configuration

# --- Configuration ---
# Replace with your Unity Catalog server host or Databricks host
# For local OSS server, typically 'http://localhost:8080/api/2.1/unity-catalog'
UC_HOST = os.environ.get('UC_HOST', 'http://localhost:8080/api/2.1/unity-catalog')
# For Databricks, typically 'https://<your-databricks-instance>/api/2.1/unity-catalog'
# Ensure you have DBR 16.4+ and serverless compute enabled for Databricks.
# Replace with your Databricks token if connecting to a Databricks-managed UC
DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN', 'dapi...')
CATALOG = os.environ.get('UC_CATALOG', 'my_catalog')
SCHEMA = os.environ.get('UC_SCHEMA', 'my_schema')

# Configure the Unity Catalog API client
config = Configuration(
    host=UC_HOST,
    # Send the token only to Databricks hosts
    # (matches *.cloud.databricks.com as well as *.azuredatabricks.net)
    access_token=DATABRICKS_TOKEN if 'databricks' in UC_HOST else None
)
api_client = ApiClient(configuration=config)

# Use the Unity Catalog API client to create an instance of the AI function client
client = UnitycatalogFunctionClient(api_client=api_client)

# --- Define a Python function to be registered ---
# Requirements: type hints for all arguments and return, Google-style docstring.
# External imports must be within the function body for execution in sandbox mode.
def add_numbers(number_1: float, number_2: float) -> float:
    """
    A function that accepts two floating point numbers, adds them,
    and returns the resulting sum as a float.

    Args:
        number_1 (float): The first number.
        number_2 (float): The second number.

    Returns:
        float: The sum of the two numbers.
    """
    return number_1 + number_2

def main():
    # Registered functions are addressed by their full name: catalog.schema.function
    full_name = f"{CATALOG}.{SCHEMA}.add_numbers"
    try:
        # Register the Python function in Unity Catalog.
        # The function name is taken from the Python callable itself.
        print(f"Registering function {full_name}...")
        created_function = client.create_python_function(
            func=add_numbers,
            catalog=CATALOG,
            schema=SCHEMA,
            replace=True  # Overwrite if already exists
        )
        print(f"Function registered: {created_function.name}")

        # Execute the registered function
        print(f"Executing function {full_name}...")
        result = client.execute_function(
            function_name=full_name,
            parameters={'number_1': 10.5, 'number_2': 20.3}
        )
        print(f"Execution result: {result.value}")

        # Clean up (optional: delete the function)
        # print(f"Deleting function {full_name}...")
        # client.delete_function(function_name=full_name)
        # print("Function deleted.")
    except Exception as e:
        print(f"An error occurred: {e}")

if __name__ == "__main__":
    # create_python_function and execute_function are synchronous wrappers,
    # so no event loop is needed here.
    main()
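The event-loop warning above matters when a coroutine has to be driven from code that may already be running inside a loop, as in a Jupyter notebook, where `asyncio.run` raises a `RuntimeError`. One common workaround is sketched below using only the standard library. `run_coroutine_sync` is a hypothetical helper, not part of `unitycatalog-ai`, and `fake_async_call` stands in for a real asynchronous client call. Note the assumption that the coroutine is not tied to the caller's loop: objects bound to another event loop (such as an already-open aiohttp session) may not work when moved to a worker thread.

```python
import asyncio
import threading
from typing import Any, Coroutine, Dict


def run_coroutine_sync(coro: Coroutine[Any, Any, Any]) -> Any:
    """Run a coroutine to completion whether or not a loop is already running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread (plain script): asyncio.run is safe.
        return asyncio.run(coro)

    # A loop is already running (e.g. a notebook). Run the coroutine on a
    # fresh loop in a worker thread instead of nesting loops in this thread.
    outcome: Dict[str, Any] = {}

    def worker() -> None:
        try:
            outcome["value"] = asyncio.run(coro)
        except BaseException as exc:  # re-raised in the calling thread below
            outcome["error"] = exc

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()  # blocks the caller (and its loop) until the coroutine finishes
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


async def fake_async_call(value: float) -> float:
    """Hypothetical stand-in for an asynchronous client call."""
    await asyncio.sleep(0)
    return value * 2


if __name__ == "__main__":
    print(run_coroutine_sync(fake_async_call(21.0)))
```

Because the helper blocks until the worker thread finishes, it suits short calls. For long-running work inside a notebook, awaiting the coroutine directly in a cell is usually the simpler choice.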