PDFQuery

library 0.4.3 · python · maintenance
verified May 1, 2026

PDFQuery is a lightweight Python library for scraping data from PDFs using jQuery-like CSS selectors or XPath expressions. It wraps pdfminer, lxml, and pyquery to provide a concise API for extracting text, tables, and layout elements. Version 0.4.3 is the latest release, and the project has seen no active development since 2016.
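To make the selector-based approach concrete, here is a minimal sketch of a common pattern: find a text label, then read whatever text sits in a rectangle just below it. The helper names `bbox_selector` and `text_below_label`, the default `width` and `height` values, and the file path in the usage note are assumptions made for illustration; they are not part of PDFQuery. The library calls used (`PDFQuery(...)`, `load()`, `pq(...)`, and the `:contains` and `:in_bbox` selectors) follow the project's README, but behavior may vary with the installed pdfminer version.

```python
def bbox_selector(x0, y0, x1, y1, element="LTTextLineHorizontal"):
    """Build a PDFQuery :in_bbox selector string for a rectangle in PDF points.

    Hypothetical helper, not part of PDFQuery itself.
    """
    return '%s:in_bbox("%s, %s, %s, %s")' % (element, x0, y0, x1, y1)


def text_below_label(pdf_path, label_text, width=150, height=30):
    """Return the text found just below the first line containing label_text.

    Hypothetical helper; width and height are assumed defaults in PDF points.
    """
    # Imported lazily so bbox_selector can be used without PDFQuery installed.
    import pdfquery

    pdf = pdfquery.PDFQuery(pdf_path)
    pdf.load()  # parse the pages into an lxml element tree

    # jQuery-style lookup of the text line that contains the label.
    label = pdf.pq('LTTextLineHorizontal:contains("%s")' % label_text)
    if not label:
        return None

    # Coordinates are stored as string attributes on the matched element.
    x0 = float(label.attr("x0"))
    y0 = float(label.attr("y0"))

    # PDF coordinates grow upward, so "below" means smaller y values.
    return pdf.pq(bbox_selector(x0, y0 - height, x0 + width, y0)).text()
```

A call might look like `text_below_label("form.pdf", "Your first name and initial")`, where `form.pdf` is a placeholder path. The box dimensions usually need tuning per document, since label and value spacing differ between PDFs.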
