Unstructured Ingest
Unstructured Ingest is a Python library that provides local ETL pipelines for preparing unstructured data (e.g., PDFs, HTML, Word documents) for Retrieval-Augmented Generation (RAG) and other AI/LLM applications. It supports a range of source and destination connectors and handles batch processing, partitioning, chunking, and embedding of documents. The current version is 1.4.24; the library is updated frequently, with new connector integrations under active development.
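To make the partition, chunk, and embed stages concrete, here is a minimal standard-library sketch of the general pattern. It does not use or reproduce the unstructured-ingest API: the names `partition`, `chunk`, `embed`, `run_pipeline`, and `Chunk` are hypothetical, the paragraph splitting stands in for real document partitioning, and the hash-based vector is a toy stand-in for a real embedding model.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Chunk:
    """A chunk of text paired with its (toy) embedding vector."""
    text: str
    embedding: list[float]


def partition(raw: str) -> list[str]:
    """Hypothetical partitioner: split raw text into elements on blank lines."""
    return [p.strip() for p in raw.split("\n\n") if p.strip()]


def chunk(elements: list[str], max_chars: int = 80) -> list[str]:
    """Greedily merge consecutive elements until max_chars would be exceeded.

    A single element longer than max_chars is kept whole as its own chunk.
    """
    chunks: list[str] = []
    current = ""
    for el in elements:
        candidate = f"{current}\n{el}" if current else el
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = el
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def embed(text: str, dims: int = 8) -> list[float]:
    """Toy deterministic embedding: SHA-256 bytes scaled into [0, 1].

    This is only a placeholder for a real embedding model.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255 for b in digest[:dims]]


def run_pipeline(raw: str) -> list[Chunk]:
    """Run partition -> chunk -> embed over one document's text."""
    return [Chunk(text=c, embedding=embed(c)) for c in chunk(partition(raw))]
```

In a real deployment, the source connector would supply the raw documents and the destination connector would receive the resulting chunks; the library's own documentation is the reference for the actual configuration classes and command-line options.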
API endpoints
full doc /v1/registry/unstructured-ingest
compatibility /v1/registry/unstructured-ingest/compatibility