{"library":"pymupdf4llm","title":"PyMuPDF Utilities for LLM/RAG","description":"PyMuPDF4LLM (also aliased as `pdf4llm`) is a Python library built on PyMuPDF, specialized in converting PDF documents into clean, structured data formats like Markdown, JSON, and plain text, specifically optimized for Large Language Model (LLM) and Retrieval-Augmented Generation (RAG) environments. It includes layout analysis, automatic OCR for scanned pages, and supports multi-column layouts and image extraction. The library is actively maintained and frequently updated, with the current stable version being 1.27.2.2.","language":"python","status":"active","last_verified":"Wed May 20","install":{"commands":["pip install -U pymupdf4llm","pip install -U 'pymupdf4llm[ocr,layout]'"],"cli":{"name":"pdf4llm","version":"sh: 1: pdf4llm: not found"}},"imports":["import pymupdf4llm\nmd_text = pymupdf4llm.to_markdown(\"input.pdf\")","import pymupdf4llm\njson_text = pymupdf4llm.to_json(\"input.pdf\")","import pymupdf4llm\nplain_text = pymupdf4llm.to_text(\"input.pdf\")"],"auth":{"required":false,"env_vars":[]},"quickstart":{"code":"import pymupdf4llm\nimport pathlib\n\n# Assuming 'input.pdf' exists in the same directory\n# For real-world use, replace with a valid path or PyMuPDF Document object\ninput_pdf_path = \"example.pdf\" \n\n# Create a dummy PDF for demonstration if it doesn't exist\n# In a real scenario, you would have your actual PDF file\ntry:\n    import fitz # PyMuPDF\n    doc = fitz.open()\n    page = doc.new_page()\n    page.insert_text((72, 72), \"# Hello, PyMuPDF4LLM!\\n\\nThis is a sample PDF content.\\n\\n- Item 1\\n- Item 2\\n\\n| Header 1 | Header 2 |\\n|----------|----------|\\n| Data 1   | Data 2   |\", fontsize=12)\n    doc.save(input_pdf_path)\n    doc.close()\nexcept ImportError:\n    print(\"PyMuPDF not installed, cannot create dummy PDF. Please provide a real PDF.\")\n    input_pdf_path = None\n\nif input_pdf_path and pathlib.Path(input_pdf_path).exists():\n    # Convert the PDF content to Markdown\n    md_text = pymupdf4llm.to_markdown(input_pdf_path)\n\n    # Print the converted markdown content\n    print(\"\\n--- Markdown Output ---\")\n    print(md_text)\n\n    # Optionally, write it to a markdown file\n    output_md_path = pathlib.Path(\"output.md\")\n    output_md_path.write_text(md_text, encoding=\"utf-8\")\n    print(f\"\\nMarkdown saved to {output_md_path.absolute()}\")\nelse:\n    print(\"Skipping quickstart as no PDF file is available.\")","lang":"python","description":"This quickstart demonstrates how to convert a PDF document into Markdown format using `pymupdf4llm.to_markdown()`. It also shows how to save the output to a file. The library can also convert to JSON and plain text using `to_json()` and `to_text()` respectively.","tag":null,"tag_description":null,"last_tested":"2026-04-24","results":[{"runtime":"python:3.10-alpine","exit_code":1},{"runtime":"python:3.10-slim","exit_code":0},{"runtime":"python:3.11-alpine","exit_code":1},{"runtime":"python:3.11-slim","exit_code":0},{"runtime":"python:3.12-alpine","exit_code":1},{"runtime":"python:3.12-slim","exit_code":0},{"runtime":"python:3.13-alpine","exit_code":1},{"runtime":"python:3.13-slim","exit_code":0},{"runtime":"python:3.9-alpine","exit_code":1},{"runtime":"python:3.9-slim","exit_code":0}]},"compatibility":{"tag":"draft","tag_description":"notable install failures or slow imports","last_tested":"2026-05-20","installed_version":"0.0.27","pypi_latest":"1.27.2.3","is_stale":true,"summary":{"python_range":"3.10–3.9","success_rate":75,"avg_install_s":7.4,"avg_import_s":2.2,"wheel_type":"wheel"},"results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"79.2M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"-U","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"79.3M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"-U","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":11.9,"import_time_s":1.02,"mem_mb":37.1,"disk_size":"298M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":1.17,"mem_mb":37.1,"disk_size":"298M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":11.8,"import_time_s":1.01,"mem_mb":37.1,"disk_size":"298M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":1.11,"mem_mb":37.1,"disk_size":"298M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"82.7M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"-U","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"82.8M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"-U","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":7.5,"import_time_s":3.67,"mem_mb":39.4,"disk_size":"255M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":4.11,"mem_mb":39.4,"disk_size":"254M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":7.4,"import_time_s":3.75,"mem_mb":39.4,"disk_size":"255M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":4,"mem_mb":39.4,"disk_size":"254M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"74.3M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"-U","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"74.5M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"-U","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":7.4,"import_time_s":2.73,"mem_mb":37.3,"disk_size":"242M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":3.08,"mem_mb":37.3,"disk_size":"241M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":7.3,"import_time_s":2.77,"mem_mb":37.3,"disk_size":"242M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":3.21,"mem_mb":37.3,"disk_size":"241M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"74.0M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"-U","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"74.2M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"-U","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":7.3,"import_time_s":2.56,"mem_mb":38.1,"disk_size":"241M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":3.15,"mem_mb":38,"disk_size":"240M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":7.3,"import_time_s":2.6,"mem_mb":38.1,"disk_size":"241M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":3.04,"mem_mb":38.1,"disk_size":"240M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"76.6M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"-U","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"76.6M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"-U","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"noisy","install_time_s":3.2,"import_time_s":0.24,"mem_mb":16.7,"disk_size":"76M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.25,"mem_mb":16.7,"disk_size":"76M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"noisy","install_time_s":3.3,"import_time_s":0.25,"mem_mb":16.7,"disk_size":"76M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"-U","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.26,"mem_mb":16.7,"disk_size":"76M"}]}}