PDFQuery

library · 0.4.3 · python · maintenance

✓ verified May 1, 2026

PDFQuery is a lightweight Python library for scraping data from PDFs using jQuery-like CSS selectors or XPath expressions. It wraps pdfminer and lxml to provide a concise API for extracting text, tables, and layout elements. Version 0.4.3 is the latest release; the project has seen no active development since 2016.
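As a rough illustration of the selector-based style described above, here is a minimal sketch that pulls the text found inside a rectangular region of one page. The helper names `bbox_selector` and `text_in_box` are hypothetical and not part of the library. The `PDFQuery(...)`, `load()`, `pq()` and `:in_bbox(...)` calls follow the project README, but they have not been checked against current Python or pdfminer releases.

```python
def bbox_selector(tag, x0, y0, x1, y1):
    """Build a selector for `tag` elements lying fully inside a box.

    Hypothetical helper. Coordinates are in PDF points, with the origin
    at the bottom-left corner of the page.
    """
    return '%s:in_bbox("%s, %s, %s, %s")' % (tag, x0, y0, x1, y1)


def text_in_box(pdf_path, x0, y0, x1, y1, page=0):
    """Return the text of horizontal text lines inside the box on one page.

    Hypothetical helper; assumes pdfquery is installed
    (for example via `pip install pdfquery`).
    """
    import pdfquery  # imported lazily so the selector helper works without it

    pdf = pdfquery.PDFQuery(pdf_path)
    pdf.load(page)  # parse only the requested page (0-based index)
    selector = bbox_selector("LTTextLineHorizontal", x0, y0, x1, y1)
    return pdf.pq(selector).text()
```

A call such as `text_in_box("example.pdf", 0, 700, 612, 792)` would then read the top strip of a US Letter page. The file name and coordinates here are placeholders, not values from any real document.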

indexed Fri May 01 · updated Mon Jun 15

Resources

github github.com/jcushman/pdfquery

package pypi.org/project/pdfquery/

API endpoints

full doc /v1/registry/pdfquery

install /v1/registry/pdfquery/install