Newspaper3k

library · 0.2.8 · python · maintenance

✓ verified Jun 28, 2026

Newspaper3k is a Python 3 library for article discovery, extraction, and natural language processing (NLP) on news websites. It extracts an article's main text along with metadata such as title, authors, publish date, images, and videos, and it can generate keywords and summaries. Its last PyPI release (0.2.8) dates from 2018. It still works for many use cases, but a community fork (`newspaper4k`) is more actively developed and offers more modern features.
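
As a rough illustration of the extraction workflow described above, the sketch below wraps the library's `Article` class in two small helpers. The helper names (`extract_article`, `describe`) and the example URL are assumptions made for this illustration and are not part of the library. It assumes the package has been installed with `pip install newspaper3k`, and the optional NLP step additionally needs NLTK's `punkt` tokenizer data.

```python
from typing import Any, Dict


def extract_article(url: str, run_nlp: bool = False) -> Dict[str, Any]:
    """Download and parse one article (hypothetical helper, not a library API)."""
    # Imported lazily so this module still loads where newspaper3k is absent.
    from newspaper import Article

    article = Article(url)
    article.download()  # fetch the raw HTML
    article.parse()     # extract text and metadata from the HTML

    result: Dict[str, Any] = {
        "title": article.title,
        "authors": article.authors,            # list of author names
        "publish_date": article.publish_date,  # datetime, or None if not found
        "top_image": article.top_image,
        "videos": article.movies,
        "text": article.text,
    }
    if run_nlp:
        # nlp() must run after parse(); it fills in keywords and summary.
        article.nlp()
        result["keywords"] = article.keywords
        result["summary"] = article.summary
    return result


def describe(result: Dict[str, Any]) -> str:
    """Format the extracted fields as a one-line description."""
    authors = ", ".join(result.get("authors") or []) or "unknown author"
    date = result.get("publish_date")
    date_text = date.strftime("%Y-%m-%d") if date else "undated"
    return f"{result.get('title', '')} ({authors}, {date_text})"


if __name__ == "__main__":
    # Placeholder URL: substitute a real news article before running.
    print(describe(extract_article("https://example.com/news/some-article")))
```

Extraction quality varies by site, so fields such as `authors` or `publish_date` may come back empty; `describe` falls back to placeholder text in that case.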

Indexed Sun Apr 12 · updated Sat Jul 11

Resources

github github.com/codelucas/newspaper

package pypi.org/project/newspaper3k/

homepage newspaper3k.github.io/newspaper3k/

API endpoints

full doc /v1/registry/newspaper3k

install /v1/registry/newspaper3k/install

compatibility /v1/registry/newspaper3k/compatibility