Aurelio Platform SDK
The Aurelio Platform SDK is a Python library (version 0.0.19) for document processing on the Aurelio Platform. It extracts text from sources such as PDFs and URLs, chunks the content, and generates embeddings, hiding most of the plumbing of an AI-powered document pipeline. Both synchronous and asynchronous clients are provided. The project appears to be actively maintained, with regular releases.
Common errors
- ValueError: AURELIO_API_KEY environment variable not set.
  cause: The `AurelioClient` constructor was called without a valid API key. The key must be passed directly or found in the `AURELIO_API_KEY` environment variable.
  fix: Set the environment variable `export AURELIO_API_KEY="your_api_key_here"` or pass the key directly: `client = AurelioClient(api_key="your_api_key_here")`.
- ModuleNotFoundError: No module named 'aurelio_sdk'
  cause: The `aurelio-sdk` package is not installed in the current Python environment or the environment is not activated.
  fix: Install the package using pip: `pip install aurelio-sdk`. Ensure your virtual environment is activated if you are using one.
- ValidationError: 'quality' is not a valid enumeration member; permitted: 'aurelio-base', 'docling-base', 'gemini-2-flash-lite'
  cause: An older, deprecated 'quality' parameter value (e.g., 'high' or 'low') was used when calling an extraction method, instead of the new explicit model names.
  fix: Update your code to use the new model names, e.g., `model='aurelio-base'` or `model='docling-base'` instead of `quality='low'` or `quality='high'`.
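When migrating older call sites, a small translation step can catch legacy values before they reach the API. The sketch below is a hypothetical helper, not part of `aurelio-sdk`: the permitted model names come from the ValidationError message above, while the mapping of `'low'` to `aurelio-base` and `'high'` to `docling-base` is an assumption inferred from the order of the examples in the fix text.

```python
# Hypothetical migration helper; nothing here is provided by aurelio-sdk.

# Model names listed as permitted in the ValidationError message.
VALID_MODELS = ("aurelio-base", "docling-base", "gemini-2-flash-lite")

# ASSUMPTION: this legacy-to-model mapping is inferred, not confirmed.
LEGACY_QUALITY_TO_MODEL = {
    "low": "aurelio-base",
    "high": "docling-base",
}


def resolve_model(value: str) -> str:
    """Return a permitted model name, translating legacy quality values."""
    if value in VALID_MODELS:
        return value
    if value in LEGACY_QUALITY_TO_MODEL:
        return LEGACY_QUALITY_TO_MODEL[value]
    raise ValueError(
        f"Unknown model {value!r}; expected one of: {', '.join(VALID_MODELS)}"
    )
```

The result can then be passed as the `model` argument, for example `model=resolve_model("low")`.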
Warnings
- deprecated: The `quality` parameter (e.g., 'low', 'high') for document extraction models is deprecated. Use specific model names like `aurelio-base`, `docling-base`, or `gemini-2-flash-lite` instead.
- gotcha: For large files or long-running extraction requests, enable polling and handle errors explicitly. If `wait` is too short or polling is not configured, the client may return before processing completes, so check the response status before using the result.
- gotcha: The SDK offers both synchronous (`AurelioClient`) and asynchronous (`AsyncAurelioClient`) APIs. The choice affects performance, especially in I/O-bound applications or when processing multiple documents concurrently.
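The polling gotcha above can be made concrete with a generic wait loop. This is a minimal sketch under stated assumptions, not the SDK's own polling mechanism: `poll_until_done` and `fetch_status` are hypothetical names, and of the status strings used here only `"completed"` appears in this document (`"pending"` and `"failed"` are assumptions). The stubbed status source stands in for whatever call returns the current document status.

```python
import time
from typing import Callable

# ASSUMPTION: only "completed" is confirmed by this document; "failed" is a guess.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_until_done(
    fetch_status: Callable[[], str],
    timeout: float = 60.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status() until it returns a terminal status or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Give up if the next sleep would carry us past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        sleep(interval)


# Stub standing in for a real status lookup (hypothetical).
statuses = iter(["pending", "pending", "completed"])
result = poll_until_done(
    lambda: next(statuses), timeout=5, interval=0, sleep=lambda _seconds: None
)
print(result)
```

Raising `TimeoutError` rather than returning the last status forces the caller to decide what to do with an unfinished extraction instead of silently using an incomplete result.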
Install
- pip install aurelio-sdk
Imports
- AurelioClient
from aurelio_sdk.client import AurelioClient
from aurelio_sdk import AurelioClient
- AsyncAurelioClient
from aurelio_sdk import AsyncAurelioClient
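To illustrate why the asynchronous client matters when processing several documents, here is a minimal sketch of bounded concurrency with `asyncio`. It does not call the SDK: `extract_many` and `fake_extract` are hypothetical, and `fake_extract` is a stub standing in for an awaitable extraction call. Whether `AsyncAurelioClient` exposes the same method names as the synchronous client is an assumption this document does not confirm.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def extract_many(
    extract_one: Callable[[str], Awaitable[str]],
    urls: Iterable[str],
    max_concurrency: int = 4,
) -> List[object]:
    """Run extract_one over urls concurrently, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(url: str) -> str:
        async with semaphore:
            return await extract_one(url)

    # return_exceptions=True keeps one failure from cancelling the other tasks;
    # results come back in the same order as the input URLs.
    return await asyncio.gather(
        *(guarded(url) for url in urls), return_exceptions=True
    )


async def fake_extract(url: str) -> str:
    """Stub for an async extraction call (hypothetical, no network access)."""
    await asyncio.sleep(0)
    if "bad" in url:
        raise RuntimeError(f"could not extract {url}")
    return f"text from {url}"


results = asyncio.run(
    extract_many(fake_extract, ["https://example.com/a", "https://example.com/bad"])
)
for item in results:
    print("error:" if isinstance(item, Exception) else "ok:", item)
```

The semaphore caps the number of in-flight requests, which is a common way to avoid overwhelming a rate-limited API while still overlapping network waits.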
Quickstart
import os

from aurelio_sdk import AurelioClient
from dotenv import load_dotenv

# Load environment variables from a .env file (optional, but good practice for API keys)
load_dotenv()

# Ensure your API key is set as an environment variable or passed directly
api_key = os.environ.get("AURELIO_API_KEY")
if not api_key:
    raise ValueError("AURELIO_API_KEY environment variable not set.")

client = AurelioClient(api_key=api_key)

# Example: Extract text from a URL
# For a real file, replace with client.extract_file(file_path="your_document.pdf")
try:
    print("Attempting to extract text from a URL...")
    response = client.extract_url(
        url="https://www.aurelio.ai/blog/building-with-openai-agents-sdk",
        model="aurelio-base",  # Use the new model names
        wait=60,  # Wait up to 60 seconds for completion
    )
    if response.status == "completed":
        print(f"Extraction Status: {response.status}")
        print(f"Extracted Document ID: {response.document.id}")
        if response.chunks:
            print("First chunk of extracted text:")
            print(response.chunks[0].text[:500] + "...")  # Print first 500 chars
        else:
            print("No chunks extracted.")
    else:
        print(f"Extraction did not complete. Status: {response.status}. Message: {response.message}")
except Exception as e:
    print(f"An error occurred during extraction: {e}")