MinerU PDF to Markdown Converter

library 3.0.9 · python

✓ verified May 26, 2026

MinerU is a document parsing tool that converts a range of input formats, including PDF, images, DOCX, PPTX, and XLSX, into machine-readable Markdown and JSON. Its output is intended for downstream retrieval, extraction, and processing, especially in LLM pipelines. The library is currently at version 3.0.9 and is actively maintained, with ongoing architectural and feature improvements, particularly in the handling of scientific literature and complex document structures.