Needle Python Client Library
needle-python is the official Python client library for the Needle API. It simplifies building Retrieval-Augmented Generation (RAG) pipelines by providing semantic search and context retrieval for Large Language Models (LLMs). The library, currently at version 0.6.0, provides tools for managing collections, uploading files, and performing contextual searches. It is actively maintained, with a focus on API integration and RAG pipeline development.
Common errors
- ModuleNotFoundError: No module named 'needle.v1'
  cause The `needle-python` package is not installed, an unrelated 'needle' package (not the API client) is installed instead, or the import path is incorrect.
  fix Ensure `needle-python` is installed using `pip install needle-python`. Confirm your import statement is `from needle.v1 import NeedleClient`.
- needle.v1.models.Error: Authentication Failed
  cause The Needle API key (`NEEDLE_API_KEY`) is missing or invalid, preventing the client from authenticating with the API.
  fix Set the `NEEDLE_API_KEY` environment variable with a valid key obtained from your Needle settings, or pass the key directly to `NeedleClient(api_key='YOUR_KEY')`.
- AttributeError: 'NoneType' object has no attribute 'id'
  cause This typically occurs when `needle.collections.create()` or another API call fails to return a valid object (e.g., due to an API error, network issue, or invalid parameters) and returns `None` instead of an object with an `id` attribute.
  fix Add error handling and check the return value of API calls before accessing its attributes. For example: `collection = needle.collections.create(...)` then `if collection: collection_id = collection.id`.
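The return-value check from the last fix can be wrapped in a small helper so that a failed call surfaces as a descriptive error rather than an `AttributeError`. This is a minimal sketch: `require_id` is a hypothetical name and not part of needle-python, and the `SimpleNamespace` object is only a stand-in for a real response. The only assumption about the library is that returned objects expose an `id` attribute, as in the Quickstart.

```python
from types import SimpleNamespace


def require_id(obj, what="API call"):
    """Return obj.id, or raise a descriptive error if obj is None or has no id.

    Hypothetical helper, not part of needle-python.
    """
    if obj is None:
        raise RuntimeError(
            f"{what} returned None; check the API key, network, and parameters"
        )
    obj_id = getattr(obj, "id", None)
    if obj_id is None:
        raise RuntimeError(f"{what} returned an object without an id: {obj!r}")
    return obj_id


# Stand-in for a successful response object, used only for demonstration.
fake_collection = SimpleNamespace(id="col_123", name="demo")
collection_id = require_id(fake_collection, "collections.create")
```

In real code the stand-in would be replaced by the actual result, for example `require_id(needle.collections.create(name=...), "collections.create")`.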
Warnings
- gotcha There are several unrelated Python libraries also named 'needle' (e.g., for visual testing, deep learning, or threading). Ensure you install `needle-python` and import from `needle.v1` to use the Needle API client library.
- breaking The client requires an API key for authentication. If `NEEDLE_API_KEY` is not set as an environment variable, API calls will fail with authentication errors.
- gotcha After adding files to a collection, Needle processes and indexes them asynchronously. Direct search queries immediately after `add` might not return results until indexing is complete.
- gotcha The import path `from needle.v1 import ...` indicates API versioning. Future major API changes might introduce a `v2` or alter the structure under `v1`, potentially requiring updates to import statements and method calls.
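To fail fast on the missing-key case described above, the key can be resolved and validated before the client is constructed. The sketch below is an illustration only: `resolve_api_key` is a hypothetical helper, not part of needle-python, and it relies on nothing from the library beyond the `NEEDLE_API_KEY` variable name.

```python
import os


def resolve_api_key(explicit_key=None, env=None):
    """Return the API key to use, or raise before any API call is attempted.

    Hypothetical helper, not part of needle-python. An explicitly passed key
    takes precedence over the NEEDLE_API_KEY environment variable.
    """
    env = os.environ if env is None else env
    key = (explicit_key or env.get("NEEDLE_API_KEY", "")).strip()
    if not key:
        raise RuntimeError(
            "NEEDLE_API_KEY is not set and no key was passed explicitly"
        )
    return key
```

The result can then be passed to the client as `NeedleClient(api_key=resolve_api_key())`, so a missing key is reported at startup instead of on the first request.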
Install
- pip install needle-python
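Because several unrelated packages share the name 'needle', it can help to confirm that the `needle-python` distribution itself is installed. The following sketch uses only the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(dist_name="needle-python"):
    """Return the installed version string of dist_name, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("needle-python is not installed; run: pip install needle-python")
else:
    print(f"needle-python {version} is installed")
```

This checks the distribution name rather than the import name, so an unrelated package that also provides a `needle` module is not mistaken for the client.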
Imports
- NeedleClient
  wrong: from needle import NeedleClient
  correct: from needle.v1 import NeedleClient
- FileToAdd
  wrong: from needle import FileToAdd
  correct: from needle.v1.models import FileToAdd
Quickstart
import os
from needle.v1 import NeedleClient
from needle.v1.models import FileToAdd
import time # For waiting during indexing
# Ensure your API key is set as an environment variable or pass it directly
# os.environ['NEEDLE_API_KEY'] = 'YOUR_API_KEY_HERE'
needle_api_key = os.environ.get('NEEDLE_API_KEY', '')
if not needle_api_key:
    print("Warning: NEEDLE_API_KEY environment variable not set. Client may fail to authenticate.")
# Initialize the client
# You can also pass the api_key directly: needle = NeedleClient(api_key=needle_api_key)
needle = NeedleClient()
# Create a new collection
collection_name = "My AI Project Docs"
print(f"Creating collection: {collection_name}")
collection = needle.collections.create(name=collection_name)
collection_id = collection.id
print(f"Collection '{collection.name}' created with ID: {collection_id}")
# Add files to the collection (e.g., from a URL)
file_url = "https://www.thoughtworks.com/content/dam/thoughtworks/documents/radar/2024/04/tr_technology_radar_vol_30_en.pdf"
file_name = "tech-radar-30.pdf"
print(f"Adding file '{file_name}' from {file_url} to collection {collection_id}")
files_to_add = [
    FileToAdd(name=file_name, url=file_url)
]
needle.collections.files.add(collection_id=collection_id, files=files_to_add)
# Wait for files to be indexed (indexing takes time)
print("Waiting for files to be indexed...")
for _ in range(10):  # Poll for up to 50 seconds
    current_files = needle.collections.files.list(collection_id)
    # Require a non-empty listing: all() is True for an empty sequence
    if current_files and all(f.status == "indexed" for f in current_files):
        print("All files indexed successfully.")
        break
    time.sleep(5)
else:
    # Runs only if the loop finished without hitting break
    print("Warning: Not all files indexed within expected time.")
# Perform a semantic search
search_prompt = "What techniques moved into adopt in this volume of technology radar?"
print(f"Searching collection {collection_id} for prompt: '{search_prompt}'")
results = needle.collections.search(collection_id, text=search_prompt)
print("\nSearch Results (context snippets):")
for r in results:
    print(f"- {r.content[:100]}...")
# Clean up (optional) - delete the collection
# print(f"Deleting collection {collection_id}")
# needle.collections.delete(collection_id)
# print("Collection deleted.")
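The polling loop in the Quickstart can be factored into a reusable function with an explicit timeout. This is a sketch under stated assumptions: `wait_until_indexed` is a hypothetical helper, not part of needle-python, and it assumes only what the Quickstart already shows, namely that each listed file has a `status` attribute that becomes `"indexed"`. The listing call is passed in as a callable so the helper does not depend on the client directly.

```python
import time


def wait_until_indexed(list_files, timeout=60.0, interval=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll list_files() until every file reports status 'indexed'.

    Returns True when all files are indexed, False on timeout.
    An empty listing is treated as "not ready yet".
    Hypothetical helper, not part of needle-python.
    """
    deadline = clock() + timeout
    while True:
        files = list(list_files())
        if files and all(f.status == "indexed" for f in files):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

With the Quickstart's variables, a call might look like `wait_until_indexed(lambda: needle.collections.files.list(collection_id))`. The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real delays.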