Needle Python Client Library

0.6.0 · active · verified Thu Apr 16

needle-python is the official Python client library for the Needle API. It supports building Retrieval-Augmented Generation (RAG) pipelines by running semantic search over your documents and returning relevant context to pass to Large Language Models (LLMs). The library, currently at version 0.6.0, provides tools for managing collections, adding files, and searching collections. It is actively maintained.

Quickstart

This quickstart demonstrates how to initialize the Needle client, create a document collection, add a file via URL, wait for it to be indexed, and then perform a semantic search to retrieve relevant context. It highlights the use of `NEEDLE_API_KEY` for authentication and the typical workflow for building RAG components.

import os
from needle.v1 import NeedleClient
from needle.v1.models import FileToAdd
import time # For waiting during indexing

# Ensure your API key is set as an environment variable or pass it directly
# os.environ['NEEDLE_API_KEY'] = 'YOUR_API_KEY_HERE'
needle_api_key = os.environ.get('NEEDLE_API_KEY', '')
if not needle_api_key:
    print("Warning: NEEDLE_API_KEY environment variable not set. Client may fail to authenticate.")

# Initialize the client
# You can also pass the api_key directly: needle = NeedleClient(api_key=needle_api_key)
needle = NeedleClient()

# Create a new collection
collection_name = "My AI Project Docs"
print(f"Creating collection: {collection_name}")
collection = needle.collections.create(name=collection_name)
collection_id = collection.id
print(f"Collection '{collection.name}' created with ID: {collection_id}")

# Add files to the collection (e.g., from a URL)
file_url = "https://www.thoughtworks.com/content/dam/thoughtworks/documents/radar/2024/04/tr_technology_radar_vol_30_en.pdf"
file_name = "tech-radar-30.pdf"
print(f"Adding file '{file_name}' from {file_url} to collection {collection_id}")

files_to_add = [
    FileToAdd(name=file_name, url=file_url)
]
needle.collections.files.add(collection_id=collection_id, files=files_to_add)

# Wait for files to be indexed (indexing takes time)
print("Waiting for files to be indexed...")
for _ in range(10): # Poll up to 10 times, 5 seconds apart (about 50 seconds)
    current_files = needle.collections.files.list(collection_id)
    # Require a non-empty list: all() is vacuously True for an empty one
    if current_files and all(f.status == "indexed" for f in current_files):
        print("All files indexed successfully.")
        break
    time.sleep(5)
else:
    print("Warning: Not all files indexed within expected time.")

# Perform a semantic search
search_prompt = "What techniques moved into adopt in this volume of technology radar?"
print(f"Searching collection {collection_id} for prompt: '{search_prompt}'")
results = needle.collections.search(collection_id, text=search_prompt)

print("\nSearch Results (context snippets):")
for r in results:
    print(f"- {r.content[:100]}...")

# Clean up (optional) - delete the collection
# print(f"Deleting collection {collection_id}")
# needle.collections.delete(collection_id)
# print("Collection deleted.")
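
The indexing wait in the quickstart can be factored into a reusable helper with an explicit deadline. The following is a minimal sketch, not part of needle-python: `wait_until_indexed` is a hypothetical name, it takes any callable that returns objects with a `status` attribute, and it assumes a failed file reports the status `"error"`.

```python
import time
from typing import Callable, Iterable


def wait_until_indexed(
    list_files: Callable[[], Iterable],
    timeout: float = 60.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll list_files() until every returned file has status "indexed".

    Returns True on success and False if the deadline passes first.
    Raises RuntimeError if any file reports status "error"
    (an assumed status value; adjust to what the API returns).
    """
    deadline = time.monotonic() + timeout
    while True:
        files = list(list_files())
        statuses = [getattr(f, "status", None) for f in files]
        if any(s == "error" for s in statuses):
            raise RuntimeError("at least one file failed to index")
        # An empty listing is treated as "not ready yet", not as success.
        if files and all(s == "indexed" for s in statuses):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

With the quickstart's client this could be called as `wait_until_indexed(lambda: needle.collections.files.list(collection_id))`. Passing the listing function as a callable keeps the helper independent of the client, so it can be exercised with a stub.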

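
To use search results as LLM context, the retrieved snippets are typically joined into a prompt under a size budget. The sketch below is one way to do that and is independent of needle-python: `build_prompt`, the prompt wording, and the `max_chars` budget are illustrative assumptions, and the function works on plain strings.

```python
from typing import Iterable


def build_prompt(question: str, snippets: Iterable[str], max_chars: int = 4000) -> str:
    """Join retrieved snippets into a numbered context block, then append the question."""
    context_parts = []
    used = 0
    for i, snippet in enumerate(snippets, start=1):
        entry = f"[{i}] {snippet.strip()}"
        if used + len(entry) > max_chars:
            break  # stop before exceeding the context budget
        context_parts.append(entry)
        used += len(entry)
    context = "\n\n".join(context_parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

With the quickstart's variables, a call might look like `build_prompt(search_prompt, (r.content for r in results))`; the returned string can then be sent to whichever LLM you use.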