Unity Catalog LangChain Integration
The `unitycatalog-langchain` library exposes Unity Catalog (UC) functions as tools in LangChain agent applications. Functions are defined and managed once in Unity Catalog and can then be used across GenAI platforms, including LangChain, LlamaIndex, OpenAI, and Anthropic. The current version is 0.3.0; the library is actively maintained, with a focus on interoperability and secure access control for AI assets.
Warnings
- breaking: LangChain 1.0 introduced significant breaking changes to its API, particularly for agents and prompt construction. When upgrading from `langchain` v0.x, code that uses `initialize_agent` or older prompt formats will break. The Quickstart below imports `AgentExecutor` and `create_tool_calling_agent` from `langchain.agents`, which is the v0.x-style agent API, so check those imports against your installed LangChain version.
- gotcha: When defining Python functions for Unity Catalog, every argument and the return value must be typed, and the docstring should follow Google-style guidelines, with descriptions for the function, its arguments, and its return value. Missing types or descriptions can cause the LLM to misinterpret the tool.
- gotcha: For local Unity Catalog server setups, make sure `UC_HOST` is configured correctly (e.g., `http://localhost:8080/api/2.1/unity-catalog`). For Databricks Unity Catalog, client initialization may differ: it often uses `DatabricksFunctionClient` or `get_uc_function_client`, without manually specifying a host or token when running inside a Databricks environment.
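The typing and docstring requirement above can be checked locally before a function is registered. The following is a minimal sketch using only the standard library. The helper name `find_uc_readiness_issues` is hypothetical and not part of any Unity Catalog package, and the checks are a rough approximation of what registration expects, not the library's own validation.

```python
import inspect
from typing import Callable, List, get_type_hints


def find_uc_readiness_issues(func: Callable) -> List[str]:
    """Hypothetical pre-registration check: list missing type hints and docstring sections."""
    issues = []
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    # Every argument and the return value should carry a type hint.
    for name in signature.parameters:
        if name not in hints:
            issues.append(f"argument '{name}' has no type hint")
    if "return" not in hints:
        issues.append("return value has no type hint")

    # A Google-style docstring is expected to describe arguments and the return value.
    doc = inspect.getdoc(func) or ""
    if not doc:
        issues.append("docstring is missing")
    else:
        if signature.parameters and "Args:" not in doc:
            issues.append("docstring has no 'Args:' section")
        if "Returns:" not in doc:
            issues.append("docstring has no 'Returns:' section")
    return issues


def add_numbers(number_1: float, number_2: float) -> float:
    """Add two numbers.

    Args:
        number_1 (float): The first number.
        number_2 (float): The second number.

    Returns:
        float: The sum of the two numbers.
    """
    return number_1 + number_2


def untyped_add(a, b):
    return a + b


print(find_uc_readiness_issues(add_numbers))  # no issues for the typed, documented function
print(find_uc_readiness_issues(untyped_add))  # lists the missing hints and docstring
```

A check like this only catches structural omissions; it says nothing about whether the descriptions are actually helpful to the LLM.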
Install
- pip install unitycatalog-langchain
- pip install "unitycatalog-langchain[databricks]"
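After installing, it can be useful to confirm which version ended up in the environment. This is a small sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the distribution name is taken from the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version("unitycatalog-langchain")
print(found if found else "unitycatalog-langchain is not installed")
```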
Imports
- UCFunctionToolkit
from unitycatalog.ai.langchain.toolkit import UCFunctionToolkit
- ApiClient
from unitycatalog.client import ApiClient
- Configuration
from unitycatalog.client import Configuration
- UnitycatalogFunctionClient
from unitycatalog.ai.core.client import UnitycatalogFunctionClient
- DatabricksFunctionClient
from unitycatalog.ai.core.databricks import DatabricksFunctionClient
- get_uc_function_client
from unitycatalog.ai.core.base import get_uc_function_client
Quickstart
import os
from unitycatalog.client import ApiClient, Configuration
from unitycatalog.ai.core.client import UnitycatalogFunctionClient
from unitycatalog.ai.langchain.toolkit import UCFunctionToolkit
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI # Using OpenAI for demonstration
# --- Configuration for Unity Catalog Client ---
# Replace with your actual Unity Catalog host.
UC_HOST = os.environ.get('UC_HOST', 'http://localhost:8080/api/2.1/unity-catalog')
# A local server at http://localhost usually needs no token. A remote UC server does,
# for example a Databricks personal access token.
# UC_TOKEN = os.environ.get('UC_TOKEN', 'YOUR_UC_AUTH_TOKEN')
config = Configuration(host=UC_HOST)
# If a token is required, you might set it like: config.access_token = UC_TOKEN
api_client = ApiClient(configuration=config)
uc_client = UnitycatalogFunctionClient(api_client=api_client)
# --- Define a simple Python function to be registered with Unity Catalog ---
def add_numbers(number_1: float, number_2: float) -> float:
    """A function that accepts two floating point numbers, adds them, and returns the resulting sum as a float.

    Args:
        number_1 (float): The first of the two numbers to add.
        number_2 (float): The second of the two numbers to add.

    Returns:
        float: The sum of the two input numbers.
    """
    return number_1 + number_2
# --- Register the function with Unity Catalog (example placeholders) ---
CATALOG = os.environ.get('UC_CATALOG', 'my_catalog')
SCHEMA = os.environ.get('UC_SCHEMA', 'my_schema')
# CATALOG and SCHEMA must already exist in your UC instance.
# `create_python_function` only needs to run once to register the function.
# The rest of this quickstart assumes 'add_numbers' is already registered;
# against a real UC instance, uncomment and run this once:
# function_info = uc_client.create_python_function(
#     func=add_numbers,
#     catalog=CATALOG,
#     schema=SCHEMA,
#     replace=True,
# )
# print(f"Registered function: {function_info.name}")
# --- Retrieve the function and create a LangChain toolkit ---
# Assuming 'add_numbers' is already registered in 'my_catalog.my_schema'
function_reference = f"{CATALOG}.{SCHEMA}.add_numbers"
toolkit = UCFunctionToolkit(client=uc_client, function_names=[function_reference])
# --- Use the tool in a LangChain Agent ---
llm = ChatOpenAI(openai_api_key=os.environ.get('OPENAI_API_KEY'))
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Make sure to use tools for additional functionality."),
    ("placeholder", "{chat_history}"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
agent = create_tool_calling_agent(llm, toolkit.get_tools(), prompt)
agent_executor = AgentExecutor(agent=agent, tools=toolkit.get_tools(), verbose=True)
response = agent_executor.invoke({"input": "What is 36939.0 + 8922.4?"})
print(response)
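The quickstart reads `UC_HOST`, `UC_CATALOG`, and `UC_SCHEMA` from the environment and builds a three-part `catalog.schema.function` reference by hand. One way to keep that in a single place is sketched below. The `UCSettings` class and its methods are hypothetical helpers, not part of `unitycatalog-langchain`, and the dot check is a simplifying assumption about what counts as a valid name part.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class UCSettings:
    """Hypothetical container for the environment values used in the quickstart."""
    host: str
    catalog: str
    schema: str

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "UCSettings":
        # Defaults mirror the placeholder values used in the quickstart.
        return cls(
            host=env.get("UC_HOST", "http://localhost:8080/api/2.1/unity-catalog"),
            catalog=env.get("UC_CATALOG", "my_catalog"),
            schema=env.get("UC_SCHEMA", "my_schema"),
        )

    def function_reference(self, function_name: str) -> str:
        """Build the 'catalog.schema.function' name, rejecting empty or dotted parts."""
        parts = (self.catalog, self.schema, function_name)
        for part in parts:
            if not part or "." in part:
                raise ValueError(f"invalid name part: {part!r}")
        return ".".join(parts)


settings = UCSettings.from_env({})  # empty mapping, so the defaults apply
print(settings.function_reference("add_numbers"))  # my_catalog.my_schema.add_numbers
```

Passing an explicit mapping to `from_env` keeps the helper easy to test; calling it with no argument reads the real process environment.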