Feedparser Library
Feedparser is a Python library for downloading and parsing syndicated feeds, including RSS (0.9x, 1.0, 2.0), Atom (0.3, 1.0), CDF, and JSON feeds. It normalizes the different feed types and versions into one consistent, dictionary-like Python structure, so the same code can process any supported format. The current stable version is 6.0.12, and the project continues to publish updates and fixes.
Warnings
- breaking The `sanitize_html` and `resolve_relative_uris` flags, which were global module attributes in `feedparser` 5.x, must now be passed directly as arguments to the `feedparser.parse()` function in version 6.x.
- gotcha Older versions of `feedparser` 6.x (prior to 6.0.12) could crash with an `AssertionError` on Python 3.10+ when encountering malformed CDATA sections in feeds.
- deprecated `feedparser` 6.0.10 and earlier rely on Python's `cgi` module, which was deprecated in Python 3.11 and removed in Python 3.13. Those versions emit a `DeprecationWarning` on Python 3.11 and 3.12 and can fail to import on Python 3.13 and later.
- gotcha The internal HTTP fetching mechanism of `feedparser` (which uses `urllib` by default) does not include a built-in timeout, potentially causing applications to hang indefinitely when a feed server is unresponsive.
- breaking `feedparser` version 6.x officially dropped support for Python 2. Early 6.0.x releases had issues where `pip` might incorrectly install them on Python 2 due to incorrect wheel metadata.
Install
pip install feedparser
Imports
- feedparser
import feedparser
Quickstart
import feedparser
import os

# Replace with a real RSS/Atom feed URL
feed_url = os.environ.get('FEED_URL', 'http://feedparser.org/docs/examples/atom10.xml')

# parse() downloads the feed and returns a dictionary-like result
d = feedparser.parse(feed_url)
print(f"Feed Title: {d.feed.title}")

# Entries are listed in the order they appear in the feed
if d.entries:
    first_entry = d.entries[0]
    print(f"First Entry Title: {first_entry.title}")
    print(f"First Entry Link: {first_entry.link}")
    # Not every entry carries a publication date
    if hasattr(first_entry, 'published_parsed'):
        print(f"First Entry Published: {first_entry.published_parsed}")