LlamaParse Reader for LlamaIndex

0.6.1 · active · verified Fri Apr 10

The `llama-index-readers-llama-parse` library provides a LlamaIndex reader that integrates with LlamaParse. It parses complex file types such as PDF and PPT into structured markdown, which LlamaIndex can then ingest for RAG and other LLM applications. The current version is 0.6.1. The package is part of the broader LlamaIndex ecosystem, which suggests its releases follow the LlamaIndex release cadence.

Install

pip install llama-index-readers-llama-parse

Imports

from llama_index.readers.llama_parse import LlamaParseReader

Quickstart

This quickstart demonstrates how to initialize the `LlamaParseReader` and load data from a local file. It highlights the mandatory `LLAMAPARSE_API_KEY` and shows how to access the parsed documents. Remember to replace 'path/to/your/document.pdf' with an actual file path.

import os
from llama_index.readers.llama_parse import LlamaParseReader

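# Ensure your LlamaParse API key is set as an environment variable, e.g.:
# os.environ["LLAMAPARSE_API_KEY"] = "your-api-key"
# Note: the names used here follow this page. Some releases of the package may
# export the reader as `LlamaParse` and read the key from `LLAMA_CLOUD_API_KEY`;
# verify against your installed version.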
api_key = os.environ.get('LLAMAPARSE_API_KEY', '')

if not api_key:
    raise ValueError("LLAMAPARSE_API_KEY environment variable not set.")

# Initialize the LlamaParse reader
# For advanced options, see LlamaParseReader documentation (e.g., result_type='markdown')
parser = LlamaParseReader(api_key=api_key, verbose=True)

# Load data from a file (replace 'path/to/your/document.pdf' with an actual file).
# LlamaParse supports various file types such as PDF, PPTX, DOCX, TXT, CSV, JSON, and XML.
# Note: parsing runs remotely and may take time to complete.
# load_data blocks while it polls LlamaParse until the parsing is complete.
try:
    documents = parser.load_data("path/to/your/document.pdf")
    print(f"Successfully parsed {len(documents)} document(s).")
    for doc in documents:
        print(f"Document ID: {doc.id_}")
        print(f"First 200 chars: {doc.text[:200]}...")
except Exception as e:
    print(f"Error parsing document: {e}")
    print("Make sure 'path/to/your/document.pdf' exists and your API key is valid.")
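
The two failure causes named in the error hint above (a missing file and a missing API key) can be checked locally before any request is sent. The following is a minimal sketch using only the standard library. The helper name `preflight_check` and the `SUPPORTED_SUFFIXES` set are hypothetical and not part of the package; the suffix list is simply copied from the file types mentioned in the quickstart comments and may not match what the service actually accepts.

```python
import os
from pathlib import Path

# Assumption: taken from the file types listed in the quickstart comments.
# The real service may accept more or fewer formats.
SUPPORTED_SUFFIXES = {".pdf", ".pptx", ".docx", ".txt", ".csv", ".json", ".xml"}


def preflight_check(file_path, env=None, key_name="LLAMAPARSE_API_KEY"):
    """Return a list of human-readable problems; an empty list means none found.

    Hypothetical helper for illustration; it never contacts LlamaParse.
    """
    env = os.environ if env is None else env
    problems = []

    # The quickstart treats the API key as mandatory.
    if not env.get(key_name):
        problems.append(f"{key_name} environment variable not set.")

    path = Path(file_path)
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        problems.append(f"Unsupported file type: {path.suffix or '(none)'}")
    elif not path.is_file():
        problems.append(f"File not found: {path}")

    return problems
```

One way to use it is to call `preflight_check("path/to/your/document.pdf")` before `parser.load_data(...)` and print each returned message, skipping the parse when the list is non-empty. Passing `env` explicitly keeps the check easy to test without touching the real environment.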
