LlamaIndex

0.14.20 · active · verified Thu Apr 09

LlamaIndex is a data framework for LLM applications, providing tools to ingest, structure, and access private or domain-specific data with large language models. It facilitates building Retrieval-Augmented Generation (RAG) applications, agents, and more. The current version is 0.14.20, with a rapid release cadence that regularly brings new integrations and improvements across its modular ecosystem.

Warnings

Install
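This section was empty in the source. The standard starter install (which bundles the core package together with the OpenAI LLM and embedding integrations used in the quickstart below) is:

```shell
# Starter bundle: llama-index-core plus the default OpenAI integrations
pip install llama-index

# Or install only the pieces the quickstart needs:
pip install llama-index-core llama-index-llms-openai llama-index-embeddings-openai
```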

Imports

Quickstart

This quickstart demonstrates how to load a document, create a vector index, and query it using the default OpenAI LLM and embedding models. It highlights the use of the `Settings` object for configuration, which replaced `ServiceContext` in LlamaIndex 0.10.0+.

import os
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Ensure you have OPENAI_API_KEY set in your environment variables
# For quick testing, a dummy key is used, but a real key is needed for actual API calls.
os.environ["OPENAI_API_KEY"] = os.environ.get("OPENAI_API_KEY", "sk-DUMMY")

# Create a dummy data directory and file for the example
if not os.path.exists("data"): 
    os.makedirs("data")
with open("data/sample_doc.txt", "w") as f:
    f.write("LlamaIndex is a data framework for building LLM applications. ")
    f.write("It helps connect custom data sources to large language models.")

try:
    # Configure the global Settings object (replaces ServiceContext)
    Settings.llm = OpenAI(model="gpt-3.5-turbo")
    Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
    Settings.chunk_size = 1024

    # 1. Load documents from a directory
    documents = SimpleDirectoryReader("data").load_data()

    # 2. Create an index from the documents
    index = VectorStoreIndex.from_documents(documents)

    # 3. Create a query engine and query the index
    query_engine = index.as_query_engine()
    response = query_engine.query("What is LlamaIndex?")

    print("Query: What is LlamaIndex?")
    print(f"Response: {response.response}")

except Exception as e:
    print(f"An error occurred: {e}")
    print("Please ensure you have `OPENAI_API_KEY` set and `llama-index-llms-openai` and `llama-index-embeddings-openai` installed.")
finally:
    # Clean up dummy file and directory
    if os.path.exists("data/sample_doc.txt"):
        os.remove("data/sample_doc.txt")
    if os.path.exists("data"):
        os.rmdir("data")
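
The `Settings.chunk_size` value set above controls how loaded documents are split into nodes before embedding. As a rough illustration of the idea only, here is a pure-Python sketch of fixed-size chunking; it is not LlamaIndex's actual splitter (the library's `SentenceSplitter` also respects sentence boundaries and a configurable overlap between chunks):

```python
def naive_chunk(text: str, chunk_size: int) -> list[str]:
    """Split text into fixed-size character chunks (illustration only)."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

doc = "LlamaIndex is a data framework for building LLM applications."
chunks = naive_chunk(doc, chunk_size=20)
print(chunks)  # each chunk is at most 20 characters
```

In the real library, larger chunks give the LLM more context per retrieved node, while smaller chunks make retrieval more precise; tuning `chunk_size` is a common first optimization.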
