Databricks Vector Search Client

0.67 · active · verified Fri Apr 10

The `databricks-vectorsearch` Python client library provides programmatic access to Databricks Vector Search, a serverless similarity search engine. It lets you manage vector search endpoints and indexes, which supports building Retrieval Augmented Generation (RAG) applications. The library works with the Databricks Data Intelligence Platform and relies on Unity Catalog for data governance. The current version is 0.67, and the library is under active development.

Warnings

Install

Imports

Quickstart

Initializes a `VectorSearchClient` and runs a similarity search against an existing index. Set the `DATABRICKS_WORKSPACE_URL` and `DATABRICKS_TOKEN` environment variables, or replace the placeholder values in the code with your own.

import os
from databricks.vector_search.client import VectorSearchClient

# Replace with your Databricks workspace URL and Personal Access Token
workspace_url = os.environ.get("DATABRICKS_WORKSPACE_URL", "https://<your-workspace-url>.databricks.com")
pat = os.environ.get("DATABRICKS_TOKEN", "")

if not pat:
    print("Warning: DATABRICKS_TOKEN environment variable not set. This example may not work without authentication.")

# Initialize the Vector Search Client
client = VectorSearchClient(workspace_url=workspace_url, personal_access_token=pat)

# Replace with your endpoint and index names
endpoint_name = "<your-vector-search-endpoint>"
index_name = "<your-catalog>.<your-schema>.<your-index-name>"

try:
    # Get an existing index instance
    index = client.get_index(endpoint_name=endpoint_name, index_name=index_name)

    # Perform a similarity search
    query_text = "What is Databricks Vector Search?"
    results = index.similarity_search(
        query_text=query_text,
        num_results=2,
        columns=["text", "id"] # specify columns to return
    )

    print(f"Similarity search results for '{query_text}':")
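    # Each row is a list holding the requested columns in order ("text", then "id");
    # the service typically appends a similarity score as the last value.
    for row in results.get('result', {}).get('data_array', []):
        text, doc_id = row[0], row[1]
        print(f"  ID: {doc_id}, Text: {str(text)[:100]}...")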

except Exception as e:
    print(f"An error occurred: {e}")
    print("Please ensure the Vector Search endpoint and index exist and your credentials are correct.")
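Indexing rows by position, as the quickstart does, is easy to get wrong when the column list changes. The sketch below pairs each row with its column names instead. It assumes the search response carries a `manifest.columns` list of `{"name": ...}` entries alongside `result.data_array`. The helper `rows_to_dicts` and the sample response are hypothetical illustrations, not part of the library, so check the shape against a real response before relying on it.

```python
from typing import Any, Dict, List


def rows_to_dicts(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pair each row in data_array with the column names from the manifest.

    Assumes the response shape {"manifest": {"columns": [{"name": ...}]},
    "result": {"data_array": [[...], ...]}}; missing keys yield an empty list.
    """
    columns = [col["name"] for col in response.get("manifest", {}).get("columns", [])]
    rows = response.get("result", {}).get("data_array", []) or []
    return [dict(zip(columns, row)) for row in rows]


# Hand-written sample in the assumed response shape (not real service output).
sample_response = {
    "manifest": {
        "column_count": 3,
        "columns": [{"name": "text"}, {"name": "id"}, {"name": "score"}],
    },
    "result": {
        "row_count": 2,
        "data_array": [
            ["Vector Search is a serverless similarity search engine.", 1, 0.91],
            ["Indexes are governed through Unity Catalog.", 2, 0.85],
        ],
    },
}

records = rows_to_dicts(sample_response)
for record in records:
    print(f"ID: {record['id']}, score: {record['score']}, text: {record['text'][:100]}")
```

In the quickstart, the same helper could be called as `rows_to_dicts(results)` in place of the positional loop, so that reordering `columns` does not silently swap fields.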
