Unity Catalog LangChain Integration

0.3.0 · active · verified Sat Apr 11

The `unitycatalog-langchain` library exposes Unity Catalog (UC) functions as tools for LangChain agent applications. Developers define and manage AI functions once in Unity Catalog and can then use them across several GenAI platforms, including LangChain, LlamaIndex, OpenAI, and Anthropic. The current version is 0.3.0; the project is actively maintained, with a focus on interoperability and secure access control for AI assets.

Warnings

Install

Imports

Quickstart

This quickstart shows how to configure a Unity Catalog function client, define a Python function, register it, retrieve it via `UCFunctionToolkit`, and use it in a LangChain agent. Registration requires a live UC server, so that step is shown commented out and the rest of the example assumes the function is already registered. The LLM example requires `OPENAI_API_KEY`. Set `UC_HOST`, `UC_CATALOG`, and `UC_SCHEMA` to match your actual Unity Catalog configuration; the defaults in the code are placeholders.
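The placeholder values mentioned above can be resolved in one place before any client is created. The following is a minimal sketch using only the standard library; `UCSettings` and `load_settings` are hypothetical helpers introduced for illustration and are not part of `unitycatalog-langchain`. It assumes the same environment variable names and defaults as the quickstart, and that function names follow the three-level `catalog.schema.function` form.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class UCSettings:
    """Hypothetical container for the quickstart's placeholder values."""
    host: str
    catalog: str
    schema: str

    def function_name(self, name: str) -> str:
        """Build the three-level name catalog.schema.function."""
        for part in (self.catalog, self.schema, name):
            # Each component must be non-empty and must not contain a dot.
            if not part or "." in part:
                raise ValueError(f"invalid name component: {part!r}")
        return f"{self.catalog}.{self.schema}.{name}"


def load_settings(env=None) -> UCSettings:
    """Read settings from a mapping (os.environ by default), with the quickstart's defaults."""
    env = os.environ if env is None else env
    return UCSettings(
        host=env.get("UC_HOST", "http://localhost:8080/api/2.1/unity-catalog"),
        catalog=env.get("UC_CATALOG", "my_catalog"),
        schema=env.get("UC_SCHEMA", "my_schema"),
    )


# Example: override only the catalog, keep the other defaults.
settings = load_settings({"UC_CATALOG": "main"})
print(settings.function_name("add_numbers"))
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test; in the quickstart code that follows, the same values are read inline with `os.environ.get`.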

import os
from unitycatalog.client import ApiClient, Configuration
from unitycatalog.ai.core.client import UnitycatalogFunctionClient
from unitycatalog.ai.langchain.toolkit import UCFunctionToolkit
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI # Using OpenAI for demonstration

# --- Configuration for Unity Catalog Client ---
# Replace with your actual Unity Catalog host (and token, if your server requires one)
UC_HOST = os.environ.get('UC_HOST', 'http://localhost:8080/api/2.1/unity-catalog')
# A local server at http://localhost usually needs no token; a remote UC server typically does
# (for example, a Databricks personal access token).
# UC_TOKEN = os.environ.get('UC_TOKEN', 'YOUR_UC_AUTH_TOKEN')

config = Configuration(host=UC_HOST)
# If a token is required, you might set it like: config.access_token = UC_TOKEN
api_client = ApiClient(configuration=config)
uc_client = UnitycatalogFunctionClient(api_client=api_client)

# --- Define a simple Python function to be registered with Unity Catalog ---
def add_numbers(number_1: float, number_2: float) -> float:
    """A function that accepts two floating point numbers, adds them, and returns the resulting sum as a float.

    Args:
        number_1 (float): The first of the two numbers to add.
        number_2 (float): The second of the two numbers to add.

    Returns:
        float: The sum of the two input numbers.
    """
    return number_1 + number_2

# --- Register the function with Unity Catalog (example placeholders) ---
CATALOG = os.environ.get('UC_CATALOG', 'my_catalog')
SCHEMA = os.environ.get('UC_SCHEMA', 'my_schema')

# CATALOG and SCHEMA must already exist in your UC instance.
# `create_python_function` only needs to be called once to register the function.
# The call is commented out here, so the rest of this quickstart assumes
# `add_numbers` is already registered. To register it against a real UC server,
# uncomment and run this once:
# function_info = uc_client.create_python_function(
#     func=add_numbers,
#     catalog=CATALOG,
#     schema=SCHEMA,
#     replace=True
# )
# print(f"Registered function: {function_info.name}")

# --- Retrieve the function and create a LangChain toolkit ---
# Assuming 'add_numbers' is already registered in 'my_catalog.my_schema'
function_reference = f"{CATALOG}.{SCHEMA}.add_numbers"
toolkit = UCFunctionToolkit(client=uc_client, function_names=[function_reference])

# --- Use the tool in a LangChain Agent ---
llm = ChatOpenAI(openai_api_key=os.environ.get('OPENAI_API_KEY'))

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Make sure to use tools for additional functionality."),
    ("placeholder", "{chat_history}"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

# `toolkit.tools` is the list of LangChain tools, one per registered function name
tools = toolkit.tools
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

response = agent_executor.invoke({"input": "What is 36939.0 + 8922.4?"})
# The result is a dict; the agent's final answer is under the "output" key
print(response["output"])
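
Before uncommenting the registration call, it can be useful to check locally that a function carries the type hints and docstring shown in `add_numbers`. The sketch below is a standard-library pre-flight check, on the assumption that `create_python_function` relies on type hints and a docstring to build the function's signature and description in Unity Catalog. `registration_problems` is a hypothetical helper, not part of the library, and it does not reproduce the library's own validation rules.

```python
import inspect
from typing import Callable, List, get_type_hints


def registration_problems(func: Callable) -> List[str]:
    """Return reasons the function may be unsuitable; an empty list means none found."""
    problems = []
    hints = get_type_hints(func)
    # Every parameter should carry a type hint.
    for name in inspect.signature(func).parameters:
        if name not in hints:
            problems.append(f"parameter '{name}' has no type hint")
    # The return type should be annotated as well.
    if "return" not in hints:
        problems.append("missing return type hint")
    # A docstring is needed so the tool has a description.
    if not inspect.getdoc(func):
        problems.append("missing docstring")
    return problems


def add_numbers(number_1: float, number_2: float) -> float:
    """Add two floating point numbers and return the sum.

    Args:
        number_1 (float): The first of the two numbers to add.
        number_2 (float): The second of the two numbers to add.

    Returns:
        float: The sum of the two input numbers.
    """
    return number_1 + number_2


def untyped(a, b):
    return a + b


print(registration_problems(add_numbers))
for problem in registration_problems(untyped):
    print(problem)
```

A check like this catches the most common omissions without contacting a server; the server-side registration remains the authority on what is accepted.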
