PDFText

library · 0.6.3 · python

✓ verified May 24, 2026

pdftext is a Python library for fast, accurate extraction of structured text from PDF documents. It parses text efficiently, detects elements such as tables and links, and handles complex layouts. The current version is 0.6.3, and the project is actively maintained, with frequent minor releases that fix bugs and add features.
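As a rough illustration of the kind of call this involves, the sketch below wraps plain-text extraction in a small helper. It assumes `plain_text_output` in `pdftext.extraction` and its `sort` keyword, as shown in the project README; the helper name `extract_plain_text` and its `extractor` parameter are hypothetical and exist only so the wrapper can be exercised without the library installed.

```python
import sys
from typing import Callable, Optional


def extract_plain_text(
    pdf_path: str,
    extractor: Optional[Callable[..., str]] = None,
) -> str:
    """Return the text of a PDF as a single string.

    `extractor` is a hypothetical hook (not part of pdftext) that lets
    callers swap in another function, for example a stub in tests.
    """
    if extractor is None:
        # Assumption: pdftext exposes plain_text_output in pdftext.extraction.
        # Imported lazily so this module still loads where pdftext is absent.
        from pdftext.extraction import plain_text_output

        extractor = plain_text_output
    # Assumption: sort=False keeps the order in which blocks are emitted
    # instead of re-sorting them by position.
    return extractor(pdf_path, sort=False)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python extract.py path/to/file.pdf
    print(extract_plain_text(sys.argv[1]))
```

Keeping the extraction function injectable is a design convenience rather than anything the library requires; calling `plain_text_output` directly works just as well for one-off scripts.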

indexed Thu Apr 16 · updated Mon Jun 01

Resources

github github.com/VikParuchuri/pdftext

package pypi.org/project/pdftext/

API endpoints

full doc /v1/registry/pdftext

install /v1/registry/pdftext/install

compatibility /v1/registry/pdftext/compatibility