pdf2docx

library · 0.5.12 · python · maintenance

✓ verified May 22, 2026

pdf2docx is an open-source Python library that converts PDF files into editable Microsoft Word DOCX documents. It uses PyMuPDF to extract data from the PDF, rule-based parsing to analyze the layout, and python-docx to generate the DOCX output. The library extracts text, images, and tables while aiming to preserve the original layout and formatting. The current version is 0.5.12, released on March 9, 2026.
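A typical conversion can be sketched as follows. This is a minimal example, not an official recipe: it assumes pdf2docx is installed (for example via `pip install pdf2docx`) and uses the library's `Converter` class with its `convert` and `close` methods. The names `default_docx_path` and `convert_pdf` are hypothetical helpers introduced only for illustration.

```python
from pathlib import Path
from typing import Optional


def default_docx_path(pdf_path: str) -> Path:
    """Hypothetical helper: same path as the PDF, with a .docx suffix."""
    return Path(pdf_path).with_suffix(".docx")


def convert_pdf(pdf_path: str, docx_path: Optional[str] = None,
                start: int = 0, end: Optional[int] = None) -> Path:
    """Hypothetical wrapper: convert pages [start, end) and return the output path."""
    # Imported lazily so the path helper above works without pdf2docx installed.
    from pdf2docx import Converter

    out = Path(docx_path) if docx_path else default_docx_path(pdf_path)
    cv = Converter(pdf_path)
    try:
        # Page indexes are zero-based; end=None converts through the last page.
        cv.convert(str(out), start=start, end=end)
    finally:
        cv.close()  # release the underlying PDF handle even if conversion fails
    return out
```

Calling `convert_pdf("report.pdf")` (a placeholder file name) would write `report.docx` next to the input. Results depend on the source PDF; scanned or heavily styled documents may not keep their layout.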

indexed Sun Apr 12 · updated Wed May 27

Resources

package pypi.org/project/pdf2docx/

homepage artifex.com/

API endpoints

full doc /v1/registry/pdf2docx

install /v1/registry/pdf2docx/install

compatibility /v1/registry/pdf2docx/compatibility