magic-pdf
1.3.12 verified Sat May 09 auth: no python
A practical tool for converting PDF to Markdown, part of the MinerU project by OpenDataLab. The current version is 1.3.12 and requires Python >=3.10, <3.14. The package is actively maintained with frequent releases.
pip install magic-pdf
Common errors
error ModuleNotFoundError: No module named 'magic_pdf'
cause Package not installed or imported with wrong name (hyphen instead of underscore).
fix Run 'pip install magic-pdf' and use 'import magic_pdf'.
error AttributeError: module 'magic_pdf' has no attribute 'parse'
cause Using deprecated function name 'parse' after v1.3.0.
fix Use 'magic_pdf.parse_pdf()' instead.
error KeyError: 'markdown'
cause Accessing 'markdown' key on result from older version (<1.3.0).
fix Check version: if <1.3.0 use result['text'], else use result['markdown'].
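The version check in the last fix can be sketched as a small helper. This is a minimal sketch, assuming the result is a plain dict and that the key names ('text' before 1.3.0, 'markdown' from 1.3.0) are as described on this page; `parse_version` and `extract_markdown` are hypothetical helper names, not part of magic-pdf.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '1.3.12' into (1, 3, 12), padding short versions with zeros."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'rc1' or 'dev0'
        parts.append(int(piece))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def extract_markdown(result: dict, version: str) -> str:
    """Pick the key this page documents for the given package version."""
    # Assumption from this page: 'markdown' for >= 1.3.0, 'text' before that.
    key = "markdown" if parse_version(version) >= (1, 3, 0) else "text"
    return result[key]
```

The version string itself could come from the standard library call `importlib.metadata.version("magic-pdf")`, which takes the PyPI distribution name rather than the import name.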
Warnings
gotcha The package name on PyPI is 'magic-pdf', but the import uses underscore: 'magic_pdf'.
fix Use 'pip install magic-pdf' and 'import magic_pdf'.
breaking Version 1.3.0 changed the output structure: the Markdown content is now under 'markdown' key instead of 'text'.
fix Access result['markdown'] for v1.3.0+, or result['text'] for older versions.
deprecated The function 'magic_pdf.parse' was deprecated in v1.3.0 in favor of 'magic_pdf.parse_pdf'.
fix Use 'parse_pdf' instead of 'parse'.
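Code that has to run against versions on both sides of the rename could look the function up by name instead of hard-coding it. The following is a hedged sketch: `resolve_parser` is a hypothetical helper, the function names 'parse_pdf' and 'parse' are taken from this page rather than verified against the library, and the two `SimpleNamespace` objects are stand-ins for the real module so the example runs without magic-pdf installed.

```python
from types import SimpleNamespace


def resolve_parser(module):
    """Return the preferred parse function, falling back to the old name."""
    # Names as documented on this page; treat them as assumptions.
    for name in ("parse_pdf", "parse"):
        func = getattr(module, name, None)
        if callable(func):
            return func
    raise AttributeError("neither 'parse_pdf' nor 'parse' found on module")


# Stand-ins for a newer and an older installation (illustrative only).
new_api = SimpleNamespace(parse_pdf=lambda path: {"markdown": "new"})
old_api = SimpleNamespace(parse=lambda path: {"text": "old"})

print(resolve_parser(new_api)("sample.pdf"))
print(resolve_parser(old_api)("sample.pdf"))
```

With the real package, `resolve_parser(magic_pdf)` would be the intended call, provided the module exposes one of the two names.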
Imports
- magic_pdf
wrong import magic_pdf as pdf
correct import magic_pdf
- parse_pdf
wrong from magic_pdf.pipeline import parse_pdf
correct from magic_pdf import parse_pdf
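Because the PyPI name ('magic-pdf') and the import name ('magic_pdf') differ, it can help to check each one separately before debugging an import error. This is a small sketch using only the standard library; `is_importable` and `installed_version` are hypothetical helper names.

```python
import importlib.util
from importlib import metadata


def is_importable(module_name: str) -> bool:
    """True if a top-level module with this import name can be found."""
    return importlib.util.find_spec(module_name) is not None


def installed_version(distribution_name: str):
    """Return the installed version for a PyPI distribution name, or None."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Import name uses an underscore; distribution name uses a hyphen.
print("import name found:", is_importable("magic_pdf"))
print("distribution version:", installed_version("magic-pdf"))
```

If the second line prints a version while the first prints False, the package is probably installed into a different environment than the one running the script.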
Quickstart
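import magic_pdf
# Convert the PDF, using ./output as the output directory
result = magic_pdf.parse_pdf('sample.pdf', output_dir='./output')
# From v1.3.0 the Markdown content is under the 'markdown' key
print(result['markdown'])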