news-please: News Crawler and Extractor
news-please is an open-source, easy-to-use Python library designed for crawling news websites and extracting structured information from articles. It can recursively follow internal hyperlinks and read RSS feeds to fetch both recent and archived articles. The library also provides an API for programmatic use within Python applications and supports extracting articles from the commoncrawl.org news archive. The project is actively maintained, with version 1.6.16 as the latest release and a regular release cadence.
Common errors
- Error: `ImportError: cannot import name 'NewsPlease' from 'newsplease'`
  Cause: Python cannot find the `NewsPlease` class within the `newsplease` package, usually because of a typo in the import statement, an improperly installed package, or a local file named `newsplease.py` shadowing the installed library.
  Fix: Ensure `news-please` is correctly installed (`pip install news-please`) and that the import statement reads `from newsplease import NewsPlease`. Check that no file named `newsplease.py` and no folder named `newsplease` in your current working directory conflicts with the installed package.
- Error: Failed to extract an article from a URL, or the article object is empty or missing expected fields.
  Cause: Web scraping is inherently fragile: website layouts change frequently, and anti-scraping measures can prevent successful extraction, so `news-please` may fail to parse an article or may return an incomplete object.
  Fix: Inspect the `article` object for `None` values or missing attributes. Try a different URL, or check whether the target website has introduced new anti-bot measures. For highly dynamic sites, additional pre-processing or custom extraction logic may be required outside of `news-please`.
- Error: `requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))` or similar network errors.
  Cause: The target server actively rejected or closed the connection, often due to aggressive request rates, a blocked User-Agent, or IP-based blocking by the website's security systems.
  Fix: Set a custom, less identifiable `USER_AGENT` in your configuration (`config.cfg`) and add delays between requests (rate limiting) to avoid overwhelming the server. If the error persists, you may be rate-limited or IP-blocked, in which case a proxy or a waiting period may be needed.
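For transient network errors like the one above, wrapping the fetch in a retry loop with backoff is a common mitigation. Below is a minimal sketch of such a helper; `fetch_with_retries` is a hypothetical name, and the commented usage with `NewsPlease.from_url` assumes `news-please` is installed and the target site is reachable.

```python
import time


def fetch_with_retries(url, fetch, retries=3, delay=2.0, backoff=2.0):
    """Call `fetch(url)`, retrying with exponential backoff on exceptions.

    `fetch` is any callable that raises on failure, e.g. a wrapper around
    NewsPlease.from_url. Assumes retries >= 1.
    """
    attempt_delay = delay
    last_exc = None
    for _ in range(retries):
        try:
            return fetch(url)
        except Exception as exc:  # e.g. requests.exceptions.ConnectionError
            last_exc = exc
            time.sleep(attempt_delay)  # back off before the next attempt
            attempt_delay *= backoff
    raise last_exc


# Usage sketch (requires news-please and network access):
# from newsplease import NewsPlease
# article = fetch_with_retries(url, NewsPlease.from_url)
```

Retries only paper over intermittent failures; if every attempt fails, the original exception is re-raised so the block still surfaces persistent rate limiting or IP blocks.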
Warnings
- gotcha Windows users may encounter issues installing dependencies such as `lxml` and `pywin32` via pip and may need to install pre-compiled wheels manually.
- gotcha Crawling with the default User-Agent string can get you blocked by many news websites.
- gotcha There's a distinction between CLI mode (for full website crawling or continuous RSS feeds) and library mode (for extracting individual URLs). Attempting full crawls directly through the library API might not yield expected results without proper setup.
- gotcha When running news-please in CLI mode, pressing `CTRL+C` multiple times to terminate the process is not recommended and can lead to data inconsistencies. It's best to allow for a graceful shutdown.
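Since two of the warnings above concern the default User-Agent and aggressive crawling, both are typically addressed in the `config.cfg` that news-please generates for CLI mode. news-please is built on Scrapy, so Scrapy settings such as `USER_AGENT` and `DOWNLOAD_DELAY` apply; the excerpt below is a hypothetical fragment, and the section and key names should be verified against the config file generated on your first run.

```ini
# Hypothetical config.cfg excerpt -- verify section/key names against
# the file news-please generates on first run.
[Scrapy]
# A descriptive custom User-Agent is less likely to be blocked than the default.
USER_AGENT = 'my-research-crawler (contact: admin@example.org)'
# Delay between requests, in seconds, to avoid overwhelming servers.
DOWNLOAD_DELAY = 2
```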
Install
- pip install news-please
Imports
- NewsPlease
from newsplease import NewsPlease
Quickstart
from newsplease import NewsPlease
url = 'https://www.theguardian.com/world/2023/jan/01/ukraine-war-russia-new-year-attacks'
article = NewsPlease.from_url(url)
if article:
    print(f"Title: {article.title}")
    print(f"Authors: {', '.join(article.authors)}")
    print(f"Publish Date: {article.date_publish}")
    print(f"Main Text (excerpt): {article.maintext[:200]}...")
else:
    print(f"Failed to extract article from {url}")
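As noted under Common errors, extraction can succeed while still leaving individual fields as `None`. A small helper makes that easy to check before using the result; `missing_fields` is a hypothetical name, and the field list mirrors the attributes used in the quickstart above.

```python
# Attributes the quickstart relies on; extend as needed.
EXPECTED_FIELDS = ("title", "authors", "date_publish", "maintext")


def missing_fields(article, fields=EXPECTED_FIELDS):
    """Return the names of expected attributes that are None or absent."""
    return [f for f in fields if getattr(article, f, None) is None]


# Usage sketch (requires news-please and network access):
# from newsplease import NewsPlease
# article = NewsPlease.from_url(url)
# if article is None or missing_fields(article):
#     print("Incomplete extraction:", None if article is None else missing_fields(article))
```

Checking for missing fields up front avoids errors such as slicing `article.maintext` when it is `None`.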