PaddleOCR

library 3.4.0 · python

✓ verified May 21, 2026

PaddleOCR is a multilingual OCR and document parsing toolkit built on the PaddlePaddle deep learning framework. It provides text detection, text recognition, and structured document understanding, and can convert images and PDFs into structured data (JSON or Markdown) for AI and LLM-based applications. The current version is 3.4.0. Minor versions are released every few months and add new models and features.
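As a minimal sketch of the image-to-structured-data flow described above, the snippet below runs recognition and keeps only confident text lines. It assumes the 3.x-style interface (`PaddleOCR(lang=...)` and `predict()`), and that each result behaves like a mapping with parallel `rec_texts` and `rec_scores` lists; check both against the installed version. The helper names `extract_lines` and `run_ocr` are illustrative and not part of the library.

```python
from typing import Any, Mapping


def extract_lines(page: Mapping[str, Any], min_score: float = 0.5) -> list[tuple[str, float]]:
    """Pair each recognized text line with its confidence score.

    Assumption: `page` exposes parallel "rec_texts" and "rec_scores" lists.
    Lines scoring below `min_score` are dropped.
    """
    texts = page.get("rec_texts", [])
    scores = page.get("rec_scores", [])
    return [(text, float(score)) for text, score in zip(texts, scores) if score >= min_score]


def run_ocr(image_path: str, min_score: float = 0.5) -> list[list[tuple[str, float]]]:
    """Run OCR on one image or PDF and return filtered lines per page."""
    # Imported lazily so extract_lines() can be used without PaddleOCR installed.
    from paddleocr import PaddleOCR

    ocr = PaddleOCR(lang="en")  # assumed 3.x constructor argument
    return [extract_lines(page, min_score) for page in ocr.predict(image_path)]
```

Calling `run_ocr("invoice.png")` (a hypothetical file) would return one list of `(text, score)` pairs per page. The filtering step is kept separate so it can be tested on plain dictionaries.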

indexed Sun Apr 12 · updated Wed May 27

Resources

docs github.com/PaddlePaddle/PaddleOCR/blob/main/README.md

github github.com/PaddlePaddle/PaddleOCR

package pypi.org/project/paddleocr/

API endpoints

full doc /v1/registry/paddleocr

install /v1/registry/paddleocr/install

compatibility /v1/registry/paddleocr/compatibility
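
The endpoint paths above are relative, and the page does not name the host that serves them. The sketch below shows one way to turn them into full URLs. `BASE_URL` is a placeholder and `endpoint_url` is an illustrative helper, so substitute the registry's real host before making any requests.

```python
from urllib.parse import urljoin

# Hypothetical host: the listing gives only the paths, not the server.
BASE_URL = "https://registry.example"

# Paths copied from the "API endpoints" listing.
ENDPOINTS = {
    "full_doc": "/v1/registry/paddleocr",
    "install": "/v1/registry/paddleocr/install",
    "compatibility": "/v1/registry/paddleocr/compatibility",
}


def endpoint_url(name: str, base_url: str = BASE_URL) -> str:
    """Return the absolute URL for a named endpoint."""
    return urljoin(base_url, ENDPOINTS[name])
```

For example, `endpoint_url("install")` joins the placeholder host with the install path. The response format of these endpoints is not documented here, so none is assumed.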