OCRmyPDF

JSON →
library 17.4.1 ·python
verified May 22, 2026

OCRmyPDF is a Python library and application that adds an invisible OCR text layer to scanned PDF files, making them searchable. It utilizes the Tesseract OCR engine and other external tools to process documents, capable of producing highly optimized and archived-ready (PDF/A) files. The project is actively maintained with frequent updates, typically seeing major version releases annually and minor/patch releases more often.

total hits 31
actors 10 distinct systems
last hit 20h ago ByteDance
ChatGPT-User
14
Script
3
ByteDance
2
OAI-SearchBot
2
ClaudeBot
1
Search engines
2
Humans
2

top countries 🇺🇸 United States · 🇰🇷 South Korea · 🇨🇦 Canada · 🇩🇪 Germany · 🇯🇵 Japan