{"id":2577,"library":"llama-index-readers-file","title":"LlamaIndex File Readers","description":"The `llama-index-readers-file` library provides specialized data loaders for various local file formats (e.g., PDF, DOCX, CSV, TXT, Image) within the LlamaIndex ecosystem. It allows users to ingest different file types into LlamaIndex Document objects for indexing and retrieval. Current version is 0.6.0, with releases typically aligning with LlamaIndex core library updates.","status":"active","version":"0.6.0","language":"en","source_language":"en","source_url":"https://github.com/run-llama/llama_index/tree/main/llama-index-integrations/readers/file","tags":["llama-index","llm","reader","data-loading","file-parsing","pdf","docx","csv"],"install":[{"cmd":"pip install llama-index-readers-file","lang":"bash","label":"Basic Install"},{"cmd":"pip install llama-index-readers-file[pdf,docx,xlsx]","lang":"bash","label":"Install with common extras"}],"dependencies":[{"reason":"Required base library for LlamaIndex components.","package":"llama-index-core","optional":false},{"reason":"Required for PDFReader functionality.","package":"pypdf","optional":true},{"reason":"Required for DocxReader functionality.","package":"docx2txt","optional":true},{"reason":"Required for PandasCSVReader and PandasExcelReader functionality.","package":"pandas","optional":true},{"reason":"Required for ImageReader, HTMLReader, and advanced file parsing.","package":"unstructured","optional":true}],"imports":[{"note":"For loading plain text files.","symbol":"FlatReader","correct":"from llama_index.readers.file import FlatReader"},{"note":"For loading PDF documents. Requires 'pypdf' extra.","symbol":"PDFReader","correct":"from llama_index.readers.file import PDFReader"},{"note":"For loading DOCX documents. Requires 'docx2txt' extra.","symbol":"DocxReader","correct":"from llama_index.readers.file import DocxReader"},{"note":"For loading CSV files. 
Requires 'pandas' extra.","symbol":"PandasCSVReader","correct":"from llama_index.readers.file import PandasCSVReader"}],"quickstart":{"code":"import tempfile\nfrom pathlib import Path\nfrom llama_index.readers.file import FlatReader\n\n# Create a dummy text file\nfile_content = \"This is a sample document for LlamaIndex. It contains some text.\"\nwith tempfile.NamedTemporaryFile(mode='w', delete=False, suffix='.txt') as tmp_file:\n    tmp_file.write(file_content)\n    tmp_file_path = Path(tmp_file.name)\n\n# Initialize the FlatReader\nreader = FlatReader()\n\n# Load data from the temporary file\ndocuments = reader.load_data(file=tmp_file_path)\n\n# Print the content of the first document\nif documents:\n    print(f\"Loaded document content: {documents[0].text[:100]}...\")\n    print(f\"Metadata: {documents[0].metadata}\")\n\n# Clean up the temporary file\ntmp_file_path.unlink()","lang":"python","description":"This quickstart uses the `FlatReader` from `llama-index-readers-file` to load a plain text document. It creates a temporary file, loads its content into LlamaIndex Document objects, and prints the first 100 characters of the loaded text along with its metadata. It requires the `llama-index-readers-file` package to be installed."},"warnings":[{"fix":"Run `pip install llama-index-readers-file`, then update import paths from `llama_index.readers.*` to `llama_index.readers.file.*`.","message":"Before LlamaIndex v0.10.x, file readers shipped inside the main `llama_index` package. Since the modularization in v0.10.x, they live in separate integration packages such as `llama-index-readers-file`, which must be installed separately.","severity":"breaking","affected_versions":"Upgrades from pre-0.10.x to 0.10.x and later"},{"fix":"Install the required extras, e.g., `pip install llama-index-readers-file[pdf]` for PDF support, or `pip install pypdf` manually. 
Check the `pyproject.toml` or documentation for a full list of extras.","message":"Many format-specific readers (e.g., PDFReader, DocxReader, PandasCSVReader) rely on additional third-party libraries that are not installed by default with `llama-index-readers-file`. You must install these optional dependencies explicitly.","severity":"gotcha","affected_versions":"All versions"},{"fix":"For basic text extraction, `FlatReader` from `llama-index-readers-file` is sufficient. For rich content, run `pip install llama-index-readers-unstructured` and use `UnstructuredReader`.","message":"For complex documents (e.g., PDFs with tables, scanned images, or nested structures), `FlatReader` and the basic type-specific readers may lose layout or miss content. For advanced parsing, consider `UnstructuredReader` (available in `llama-index-readers-unstructured`).","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-10T00:00:00.000Z","next_check":"2026-07-09T00:00:00.000Z"}