justext

library · 3.0.2 · python

✓ verified May 20, 2026

justext is a heuristic-based boilerplate removal tool for HTML documents. It extracts the main content of a web page and discards navigation, advertisements, and other extraneous elements. The current version is 3.0.2; releases mainly address bug fixes and compatibility issues.

Resources

github github.com/miso-belica/jusText

package pypi.org/project/justext/

API endpoints

full doc /v1/registry/justext

install /v1/registry/justext/install

compatibility /v1/registry/justext/compatibility