LlamaIndex Core
LlamaIndex Core provides the foundational interface and components for building LLM-powered applications, enabling users to connect large language models with their private or domain-specific data. It includes data structures, indexing tools, query engines, and basic abstractions for LLMs and embedding models. The current version is 0.14.20, with frequent, often daily, releases across its modular ecosystem.
Warnings
- breaking Major architectural shift to a modular package structure from version 0.10.x onwards. Core functionality moved to `llama-index-core`, and integrations (LLMs, embeddings, vector stores, data loaders, etc.) became separate packages (e.g., `llama-index-llms-openai`, `llama-index-embeddings-openai`).
- breaking The `ServiceContext` class was deprecated and replaced by the global `Settings` object. `Settings` is now the recommended way to configure LLMs, embedding models, chunk sizes, etc.
- deprecated Support for Python 3.9 has been dropped.
- gotcha Many LlamaIndex components (LLMs, embeddings, vector stores, data loaders) default to using `openai` if not explicitly configured. This often leads to `openai` being a de-facto dependency for basic usage, requiring an API key even if not intended.
- gotcha When migrating from older versions, `Document` and `Node` structures might have subtle differences in metadata handling and content fields. For instance, `text` vs `content` or `extra_info` vs `metadata`.
Install
- pip install llama-index-core
- pip install 'llama-index-llms-openai' 'llama-index-embeddings-openai'
Imports
- VectorStoreIndex
from llama_index.core import VectorStoreIndex
- SimpleDirectoryReader
from llama_index.core.readers import SimpleDirectoryReader
- Settings
from llama_index.core import Settings
- OpenAI
from llama_index.llms.openai import OpenAI
- OpenAIEmbedding
from llama_index.embeddings.openai import OpenAIEmbedding
Quickstart
import os
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
# Ensure you have your OpenAI API key set as an environment variable
# os.environ["OPENAI_API_KEY"] = "sk-..."
OPENAI_API_KEY = os.environ.get('OPENAI_API_KEY', '')
if not OPENAI_API_KEY:
    raise ValueError("OPENAI_API_KEY environment variable not set.")
# Create a dummy data directory and file
os.makedirs("data", exist_ok=True)
with open("data/hello.txt", "w") as f:
    f.write("The quick brown fox jumps over the lazy dog.\n")
    f.write("LlamaIndex is a data framework for LLM applications.")
# 1. Load data
documents = SimpleDirectoryReader("data").load_data()
# 2. Configure global settings (LLM and Embedding Model)
Settings.llm = OpenAI(api_key=OPENAI_API_KEY, model="gpt-3.5-turbo")
Settings.embed_model = OpenAIEmbedding(api_key=OPENAI_API_KEY, model="text-embedding-ada-002")
# 3. Create an index
index = VectorStoreIndex.from_documents(documents)
# 4. Create a query engine
query_engine = index.as_query_engine()
# 5. Query the index
response = query_engine.query("What is LlamaIndex?")
print(response.response)