PDFText

library 0.6.3 · python
verified May 24, 2026

pdftext is a Python library for fast, accurate extraction of structured text from PDF documents. It parses text efficiently, detects elements such as tables and links, and handles complex layouts. The current version is 0.6.3. The project is actively maintained, with frequent minor releases that fix bugs and add features.
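
As a rough illustration of what "structured text" means here, the sketch below flattens one page of structured output back into plain lines. It assumes the `dictionary_output` function in `pdftext.extraction` and a page layout of blocks containing lines containing spans, as described in the project's README; check both against the installed version. The helper names `page_text` and `extract_pages` are hypothetical and not part of pdftext.

```python
from typing import Any


def page_text(page: dict[str, Any]) -> str:
    """Join span text line by line for a single page dict.

    Assumes the page has the shape blocks -> lines -> spans -> "text";
    missing keys are treated as empty.
    """
    lines = []
    for block in page.get("blocks", []):
        for line in block.get("lines", []):
            joined = "".join(span.get("text", "") for span in line.get("spans", []))
            # Spans may carry their own trailing newline; drop it before joining.
            lines.append(joined.rstrip("\n"))
    return "\n".join(lines)


def extract_pages(pdf_path: str) -> list[str]:
    """Return one plain-text string per page of the PDF at pdf_path."""
    # Imported lazily so page_text stays usable without pdftext installed.
    # Assumption: dictionary_output(pdf_path) returns a list of page dicts.
    from pdftext.extraction import dictionary_output

    return [page_text(page) for page in dictionary_output(pdf_path)]
```

Calling `extract_pages("report.pdf")` on a local file would return a list of strings, one per page. The block, line and span dictionaries are also expected to carry layout data such as bounding boxes, which this sketch ignores.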

total hits 25
actors 8 distinct systems
last hit 1d ago ByteDance
ByteDance: 6
MetaBot: 4
Script: 3
GPTBot: 2
ClaudeBot: 1
Search engines: 1

top countries 🇺🇸 United States · 🇸🇬 Singapore · 🇫🇷 France · 🇩🇪 Germany · 🇨🇦 Canada