Databricks OpenAI Integration
The `databricks-openai` package provides seamless integration of Databricks AI features into OpenAI applications. It extends the standard OpenAI client with Databricks authentication and offers specialized tools such as `VectorSearchRetrieverTool` for interacting with Databricks' AI capabilities. Currently at version 0.14.0, it is under active development, with frequent updates to support new OpenAI models and Databricks features.
Warnings
- gotcha When using `DatabricksOpenAI` with non-GPT models (e.g., Claude, Llama) for tool calls, the client automatically strips the `strict` field from tool definitions, since these models do not support this OpenAI-specific parameter; no manual removal is required.
- gotcha Encountering '429 Too Many Requests' errors (rate limiting) is common when processing large datasets with OpenAI APIs via Databricks. This indicates exceeding the allowed tokens per minute (TPM) or requests per minute (RPM).
- gotcha Failures with SQL `AI_QUERY` function, particularly the `[REMOTE_FUNCTION_HTTP_FAILED_ERROR]` (SQLSTATE: 57012), can indicate issues like prompt policy violations or internal operational problems with certain OpenAI models.
- gotcha Common 500-level errors when invoking models via Databricks often stem from mismatches in endpoint names, incorrect API keys, resource names, deployment names, or insufficient permissions. This applies to both `databricks-openai` and other integration packages.
- gotcha When using the OpenAI Responses API on Databricks, specific parameters like `background`, `store`, `previous_response_id`, and `service_tier` are not supported for pay-per-token foundation models. External models generally support all parameters.
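For the rate-limiting gotcha above, the usual mitigation is retrying with exponential backoff. A minimal sketch follows; the helper `call_with_backoff` is hypothetical (not part of `databricks-openai`), and for simplicity it matches 429s by inspecting the error message rather than catching `openai.RateLimitError` directly.

```python
import random
import time


def call_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Retry `fn` on rate-limit errors with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as e:
            # The openai client raises RateLimitError for 429s; this sketch
            # matches broadly on the message instead of the exception type.
            if "429" not in str(e) and "rate limit" not in str(e).lower():
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
    return fn()  # final attempt; let any remaining error propagate


# Usage (assuming `client` is a configured DatabricksOpenAI instance):
# completion = call_with_backoff(
#     lambda: client.chat.completions.create(
#         model="databricks-gpt-5-mini",
#         messages=[{"role": "user", "content": "Hello"}],
#     )
# )
```

If you control the workload, reducing request concurrency or batch size is often more effective than retrying, since backoff only spreads the same TPM/RPM load over time.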
Install
- pip install databricks-openai openai
- pip install 'databricks-openai[memory]'
Imports
- DatabricksOpenAI
from databricks_openai import DatabricksOpenAI
- VectorSearchRetrieverTool
from databricks_openai import VectorSearchRetrieverTool
- UCFunctionToolkit
from databricks_openai import UCFunctionToolkit
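The Quickstart below does not exercise `UCFunctionToolkit`. A minimal sketch of how it could plug into a chat completion call, assuming the toolkit follows the unitycatalog-ai pattern of being constructed from fully qualified Unity Catalog function names and exposing OpenAI-format tool specs via a `.tools` attribute; the function name `main.default.add_numbers` is hypothetical.

```python
# Sketch: exposing Unity Catalog functions as OpenAI tools.
# Assumes UCFunctionToolkit is constructed from fully qualified UC function
# names and exposes OpenAI-format tool definitions via `.tools`.

def build_uc_tools(toolkit_cls, function_names):
    """Build OpenAI tool specs from Unity Catalog function names."""
    toolkit = toolkit_cls(function_names=function_names)
    return toolkit.tools


# Usage (hypothetical function name):
# from databricks_openai import UCFunctionToolkit
# tools = build_uc_tools(UCFunctionToolkit, ["main.default.add_numbers"])
# client.chat.completions.create(model=..., messages=..., tools=tools)
```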
Quickstart
import json
import os

from databricks_openai import DatabricksOpenAI, VectorSearchRetrieverTool
from openai.types.chat import ChatCompletionMessageParam

# Ensure your Databricks host and token are set as environment variables
# DATABRICKS_HOST='https://<your-workspace-url>.cloud.databricks.com'
# DATABRICKS_TOKEN='dapi...'

# Initialize the Databricks OpenAI client.
# It automatically picks up DATABRICKS_HOST and DATABRICKS_TOKEN from environment
# variables or Databricks CLI configuration.
client = DatabricksOpenAI()

# Example 1: Simple chat completion using a Databricks-hosted model
try:
    chat_completion = client.chat.completions.create(
        model="databricks-gpt-5-mini",  # Or your specific Databricks-hosted endpoint name
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is Databricks Unity Catalog?"},
        ],
    )
    print("\n--- Simple Chat Completion ---")
    print(f"Assistant: {chat_completion.choices[0].message.content}")
except Exception as e:
    print(f"Error during simple chat completion: {e}")

# Example 2: Chat completion with a Vector Search Retriever Tool
# Replace 'catalog.schema.my_index_name' with your actual Vector Search index name
index_name = os.environ.get('DATABRICKS_VECTOR_SEARCH_INDEX', 'catalog.schema.my_index_name')
try:
    dbvs_tool = VectorSearchRetrieverTool(index_name=index_name)
    messages: list[ChatCompletionMessageParam] = [
        {"role": "system", "content": "You are a helpful assistant that uses provided tools."},
        {"role": "user", "content": "Using the Databricks documentation, what is Spark?"},
    ]
    first_response = client.chat.completions.create(
        model="databricks-gpt-5-mini",
        messages=messages,
        tools=[dbvs_tool.tool],
    )
    print("\n--- Chat Completion with Tool ---")
    tool_calls = first_response.choices[0].message.tool_calls
    if tool_calls and tool_calls[0].function.name == dbvs_tool.tool.function.name:
        tool_call = tool_calls[0]
        args = json.loads(tool_call.function.arguments)
        # In a real scenario, this would execute on Databricks
        # For demonstration, we simulate a response
        # result = dbvs_tool.execute(query=args["query"])
        result = {"docs": [{"text": "Apache Spark is an open-source, distributed processing system used for big data workloads."}]}
        messages.append(first_response.choices[0].message)
        messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": json.dumps(result)})
        second_response = client.chat.completions.create(
            model="databricks-gpt-5-mini",
            messages=messages,
            tools=[dbvs_tool.tool],  # Tools should still be passed on the second turn
        )
        print(f"Assistant (using tool): {second_response.choices[0].message.content}")
    else:
        print(f"Assistant: {first_response.choices[0].message.content}")
except Exception as e:
    print(f"Error during tool usage: {e}")
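Because `DatabricksOpenAI` extends the standard OpenAI client, Responses API calls follow the usual client shape. A minimal sketch, wrapped in a function so the client is injected; per the warning above, it omits `background`, `store`, `previous_response_id`, and `service_tier`, which pay-per-token foundation models do not support. The default endpoint name is an assumption; substitute your own.

```python
def summarize_delta_lake(client, model="databricks-gpt-5-mini"):
    """Call the Responses API via a DatabricksOpenAI client.

    Deliberately avoids parameters (background, store, previous_response_id,
    service_tier) that pay-per-token foundation models do not support.
    """
    response = client.responses.create(
        model=model,  # assumed endpoint name; use your own
        input="Summarize what Delta Lake is in one sentence.",
    )
    return response.output_text


# Usage:
# client = DatabricksOpenAI()
# print(summarize_delta_lake(client))
```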