Unity Catalog OpenAI Tools
The `unitycatalog-openai` library exposes Databricks Unity Catalog functions as OpenAI tools, so that large language models can discover and execute functions registered in Unity Catalog. The current version is 0.2.0. As an early-stage library, it is released on an irregular schedule.
Warnings
- breaking The library explicitly requires `openai>=1.0.0` and `databricks-sdk>=0.20.0,<1.0.0`. Using older versions of these dependencies will lead to import errors or runtime issues due to API changes.
- gotcha Databricks authentication requires `DATABRICKS_HOST` and `DATABRICKS_TOKEN` to be set as environment variables. Without these, the `UnityCatalogOpenAIToolFactory` will fail to connect to your Databricks workspace.
- gotcha The identity behind the Databricks Personal Access Token (PAT) used for `DATABRICKS_TOKEN` must have sufficient Unity Catalog privileges: `USE CATALOG` on the catalog, `USE SCHEMA` on the schema, and `EXECUTE` on the functions. Missing privileges will result in errors when `get_tools()` attempts to list functions.
- breaking As a library in early development (v0.2.0), API surfaces, class names, and method signatures are subject to breaking changes even in minor version increments. Always consult the GitHub repository's README for the latest usage.
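The warnings above can be turned into a small preflight check that runs before any Databricks or OpenAI call. The following is a minimal sketch, not part of the library: the helper names (`missing_env_vars`, `version_in_range`, `installed_version`) are hypothetical, and the version parser is deliberately simple (it ignores pre-release suffixes such as `rc1`).

```python
import os
from importlib import metadata

# Environment variables named in the warnings above
REQUIRED_ENV_VARS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")


def missing_env_vars(env=os.environ):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    # Pad to three components so '1.0' compares equal to '1.0.0'
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def version_in_range(installed, minimum, below=None):
    """True if minimum <= installed, and installed < below when given."""
    current = parse_version(installed)
    if current < parse_version(minimum):
        return False
    return below is None or current < parse_version(below)


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Missing env vars:", missing_env_vars())
    # Constraints taken from the first warning above
    for package, minimum, below in (("openai", "1.0.0", None),
                                    ("databricks-sdk", "0.20.0", "1.0.0")):
        found = installed_version(package)
        ok = found is not None and version_in_range(found, minimum, below)
        print(f"{package}: installed={found} ok={ok}")
```

For stricter comparisons that handle pre-releases and local version tags, the `packaging` library's `Version` class is a more robust choice than the hand-rolled parser in this sketch.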
Install
pip install unitycatalog-openai
Imports
- UnityCatalogOpenAIToolFactory
from unitycatalog_openai_tools import UnityCatalogOpenAIToolFactory
Quickstart
import os
from openai import OpenAI
from unitycatalog_openai_tools import UnityCatalogOpenAIToolFactory
# Ensure environment variables are set:
# OPENAI_API_KEY
# DATABRICKS_HOST (e.g., https://adb-XXXXXXXXXXXXXXXX.XX.databricks.com)
# DATABRICKS_TOKEN (Databricks Personal Access Token)
openai_api_key = os.environ.get('OPENAI_API_KEY', '')
db_host = os.environ.get('DATABRICKS_HOST', '')
db_token = os.environ.get('DATABRICKS_TOKEN', '')
if not all([openai_api_key, db_host, db_token]):
    print("Please set OPENAI_API_KEY, DATABRICKS_HOST, and DATABRICKS_TOKEN environment variables.")
    raise SystemExit(1)
# Initialize OpenAI client
client = OpenAI(api_key=openai_api_key)
# Create the Unity Catalog tool factory
# Specify desired catalog and schema
tool_factory = UnityCatalogOpenAIToolFactory(
    databricks_host=db_host,
    databricks_token=db_token,
    catalog_name="main",  # Replace with your catalog name
    schema_name="default"  # Replace with your schema name
)
# Get available tools
uc_tools = tool_factory.get_tools()
# Convert tools to OpenAI format
openai_tools = [tool.openai_function for tool in uc_tools]
# Example: Call OpenAI Chat Completion with tools
messages = [{
    "role": "user",
    "content": "What functions are available in Unity Catalog?"
}]
try:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # or your preferred tool-calling model
        messages=messages,
        tools=openai_tools,
        tool_choice="auto"  # Let the model decide whether it needs a tool
    )
    message = response.choices[0].message
    print("OpenAI API response (initial):")
    print(message)

    # If the model requests tool calls, answer each one
    if message.tool_calls:
        # Append the assistant message once, before any tool results
        messages.append(message)
        for tool_call in message.tool_calls:
            tool_name = tool_call.function.name
            tool_arguments = tool_call.function.arguments  # JSON-encoded string
            print(f"\nModel requested tool: {tool_name} with arguments: {tool_arguments}")

            # Find the matching Unity Catalog tool
            tool = next((t for t in uc_tools if t.name == tool_name), None)
            if tool is None:
                content = '{"status": "error", "message": "Unknown tool."}'
            else:
                # In a real application, you'd parse the arguments and call the
                # tool's underlying function. For now, return a mock result.
                print(f"Executing mock call for {tool_name}...")
                # result = tool.execute(**json.loads(tool_arguments))  # needs `import json`
                # print(f"Tool result: {result}")
                content = '{"status": "success", "message": "Function discovery complete."}'

            # Every requested tool call needs a matching tool message
            messages.append({
                "tool_call_id": tool_call.id,
                "role": "tool",
                "name": tool_name,
                "content": content
            })

        # Make one follow-up call to OpenAI with all tool outputs
        second_response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=messages,
        )
        print("\nOpenAI API response (after tool execution):")
        print(second_response.choices[0].message.content)
except Exception as e:
    print(f"An error occurred: {e}")
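The quickstart mocks the tool execution step. As a rough illustration of what that step involves, the sketch below parses the JSON argument string from a tool call, dispatches to a Python callable, and wraps the outcome in a tool message. It deliberately avoids the library's own execution API, which the quickstart leaves commented out. The names `run_tool_call`, `tool_message`, `registry`, and `add_numbers` are hypothetical, and `add_numbers` is only a stand-in for a Unity Catalog function.

```python
import json


def run_tool_call(name, arguments_json, registry):
    """Execute one requested tool and return a JSON string for the tool message."""
    func = registry.get(name)
    if func is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        # The model sends arguments as a JSON-encoded string
        kwargs = json.loads(arguments_json or "{}")
    except json.JSONDecodeError as exc:
        return json.dumps({"error": f"invalid arguments: {exc}"})
    if not isinstance(kwargs, dict):
        return json.dumps({"error": "arguments must be a JSON object"})
    try:
        result = func(**kwargs)
    except Exception as exc:
        # Report the failure to the model instead of crashing the loop
        return json.dumps({"error": f"{type(exc).__name__}: {exc}"})
    return json.dumps({"result": result}, default=str)


def tool_message(tool_call_id, content):
    """Build the chat message that answers a single tool call."""
    return {"role": "tool", "tool_call_id": tool_call_id, "content": content}


# Hypothetical stand-in for a function registered in Unity Catalog
def add_numbers(a, b):
    return a + b


registry = {"add_numbers": add_numbers}
reply = tool_message(
    "call_123",
    run_tool_call("add_numbers", '{"a": 2, "b": 3}', registry),
)
print(reply)
```

Returning errors as JSON content, rather than raising, keeps the conversation valid: the chat API expects one tool message for every tool call the model made, and the model can often recover when told what went wrong.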