news-please: News Crawler and Extractor
news-please is an open-source, easy-to-use Python library for crawling news websites and extracting structured information from articles. It can recursively follow internal hyperlinks and read RSS feeds to fetch both recent and archived articles. The library also provides an API for programmatic use within Python applications and supports extracting articles from the commoncrawl.org news archive. The project is actively maintained, with version 1.6.16 released and a regular release cadence.
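As a rough illustration of the programmatic use mentioned above, the sketch below wraps the library's `NewsPlease.from_url` call and reads the `title` and `maintext` attributes of the returned article. The helper names `fetch_article` and `summarize` are hypothetical and not part of news-please, and the stand-in object is there only so the example runs offline; a real call needs network access and the `news-please` package installed.

```python
from types import SimpleNamespace


def fetch_article(url):
    """Download and parse one article (needs network access and news-please)."""
    # Imported lazily so the rest of this sketch runs without the package.
    from newsplease import NewsPlease
    return NewsPlease.from_url(url)


def summarize(article, max_chars=80):
    """Hypothetical helper: build a small dict from an article-like object."""
    # Extraction can leave fields empty, so fall back to an empty string.
    text = getattr(article, "maintext", None) or ""
    return {
        "title": getattr(article, "title", None),
        "preview": text[:max_chars],
    }


# Offline demonstration with a stand-in object instead of a real download.
demo = SimpleNamespace(title="Example headline", maintext="Body text of the article.")
summary = summarize(demo)
print(summary)
```

With the package installed, `summarize(fetch_article(url))` would apply the same helper to a live article URL of your choice.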
Traffic · last 30 days ↑50% vs prev 7d
total hits 17
actors 7 distinct systems
last hit 3d ago MetaBot
top countries 🇺🇸 United States · 🇩🇪 Germany · 🇫🇷 France · 🇨🇦 Canada · 🇫🇮 Finland
API endpoints
full doc /v1/registry/news-please
install /v1/registry/news-please/install
compatibility /v1/registry/news-please/compatibility
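The endpoint paths above are relative, so a client has to join them to the registry's host. A minimal sketch of that step follows; `BASE_URL` is a placeholder assumption, since the listing does not state the host, and `endpoint_url` is a hypothetical helper name.

```python
from urllib.parse import urljoin

# Placeholder host: the listing gives only paths, not the registry's domain.
BASE_URL = "https://registry.example.com"

# Paths copied from the endpoint list above.
ENDPOINTS = {
    "full doc": "/v1/registry/news-please",
    "install": "/v1/registry/news-please/install",
    "compatibility": "/v1/registry/news-please/compatibility",
}


def endpoint_url(name, base=BASE_URL):
    """Return the absolute URL for one of the listed endpoints."""
    return urljoin(base, ENDPOINTS[name])


for name in ENDPOINTS:
    print(name, endpoint_url(name))
```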