micawber
micawber is a small Python library for extracting rich content (such as videos, images, or summaries) from URLs using the oEmbed specification. It ships with a set of pre-configured providers for common services such as YouTube, Vimeo, and Flickr. The latest version is 0.6.2, released in 2017; the library appears to be in maintenance mode: stable, but not actively developed.
Common errors
-
No provider found for URL: http://example.com/some-unsupported-url
cause: The URL does not match any of `micawber`'s configured oEmbed providers, or the provider's endpoint has changed since `micawber`'s last update.
fix: Verify that the URL's service supports oEmbed. If it does, look up the official oEmbed endpoint and consider registering a custom provider pattern on your `micawber.ProviderRegistry` instance (the object returned by `micawber.bootstrap_basic()`).
-
KeyError: 'title' (or 'html', 'thumbnail_url', etc.)
cause: The oEmbed response received from the provider was missing the expected key, which indicates an incomplete or malformed response, or that the provider could not retrieve the content.
fix: Access dictionary keys using `.get('key')` or `.get('key', 'Default Value')` to prevent `KeyError`. Print the full `response` dictionary to see what data was actually received.
-
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
cause: A network issue (e.g., DNS resolution failure, firewall, server unavailability, or client-side connection reset) occurred while `micawber` was attempting to fetch content from an oEmbed provider.
fix: Verify your internet connectivity and the status of the oEmbed provider's server. Implement retry logic with exponential backoff for network requests to handle transient connection issues.
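The retry advice above can be sketched as a small wrapper around any fetch callable. This is a minimal illustration, not part of `micawber`: `fetch_with_retry` and its parameters are hypothetical names, and `fetch` is assumed to be something like the registry's `request` method. The default `retry_on=(OSError,)` covers Python's built-in connection errors; pass a different tuple if your fetch callable raises another exception type for network failures.

```python
import time


def fetch_with_retry(fetch, url, attempts=4, base_delay=0.5,
                     retry_on=(OSError,), sleep=time.sleep):
    """Call fetch(url), retrying with exponential backoff on transient errors.

    fetch    -- any callable that takes a URL (hypothetical; e.g. a registry's request method)
    retry_on -- exception types treated as transient (assumption: network errors)
    sleep    -- injectable so the delays can be observed or skipped in tests
    """
    for attempt in range(attempts):
        try:
            return fetch(url)
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
```

Errors that are not transient, such as a URL with no matching provider, are deliberately left out of `retry_on` so they fail immediately instead of being retried.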
Warnings
- gotcha The built-in oEmbed provider list has not been updated since the library's last release in 2017. New services, or changes to existing service endpoints, may cause `micawber` to fail to retrieve content or to retrieve incorrect data for certain URLs.
- gotcha `micawber` does not configure a cache by default when you call `micawber.bootstrap_basic()` or create a `micawber.ProviderRegistry()` without a `cache` argument. This can lead to repeated network requests for the same URL, hurting performance and potentially triggering rate limits on external APIs.
- gotcha If an oEmbed provider returns an error, an incomplete response, or malformed data, code that indexes expected fields directly (e.g., `response['title']`, `response['html']`) can raise a `KeyError`, or the returned dictionary may simply be incomplete with no explicit error status.
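One way to guard against incomplete responses is to check a response dictionary before using it. The sketch below is an illustration under stated assumptions, not a `micawber` feature: `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, and the table lists only the type-specific required parameters from the oEmbed specification (`photo` needs `url`, `width`, `height`; `video` and `rich` need `html`, `width`, `height`; `link` has none).

```python
# Type-specific required response parameters from the oEmbed specification.
REQUIRED_FIELDS = {
    "photo": ("url", "width", "height"),
    "video": ("html", "width", "height"),
    "rich": ("html", "width", "height"),
    "link": (),
}


def missing_fields(response):
    """Return the required oEmbed field names that are absent from response.

    A missing or unrecognized 'type' is reported as ['type'], since the
    other requirements cannot be determined without it.
    """
    if not isinstance(response, dict):
        return ["type"]
    kind = response.get("type")
    if kind not in REQUIRED_FIELDS:
        return ["type"]
    return [name for name in REQUIRED_FIELDS[kind] if name not in response]
```

An empty list means the response carries everything its type requires; anything else names what to log or fall back on before rendering.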
Install
-
pip install micawber
Imports
- bootstrap_basic
from micawber import bootstrap_basic
- ProviderRegistry
from micawber import ProviderRegistry
- bootstrap_basic (submodule path)
from micawber.providers import bootstrap_basic
The top-level form `from micawber import bootstrap_basic` is not wrong and is more common for convenience.
Quickstart
import micawber
from micawber.cache import Cache

# Configure the provider registry with the default providers and an in-memory cache
m = micawber.bootstrap_basic(cache=Cache())

# Example URL (a YouTube video, a service that supports oEmbed)
youtube_url = 'https://www.youtube.com/watch?v=dQw4w9WgXcQ'

# Request oEmbed data for a URL
try:
    response = m.request(youtube_url)
    print(f"Title: {response.get('title')}")
    print(f"Type: {response.get('type')}")
    print(f"HTML: {response.get('html', 'No HTML provided')[:50]}...")
except Exception as e:
    print(f"Error requesting oEmbed data: {e}")

# Alternatively, parse text containing URLs and replace them with rich content.
# parse_text is a module-level function; the registry is passed as its second argument.
text_with_url = f'Check out this video: {youtube_url} and some other stuff.'
html_output = micawber.parse_text(text_with_url, m)
print("\n--- Parsed Text HTML Output ---")
print(html_output[:200] + '...')  # Print first 200 chars for brevity
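The quickstart passes a cache so that repeated lookups of the same URL do not trigger another network request. The effect can be sketched without the library as a memoizing wrapper. `CachedFetcher` is a hypothetical name used for illustration only; it is not how `micawber` implements caching, and `fetch` is assumed to be any callable that takes a URL and returns a response dictionary.

```python
class CachedFetcher:
    """Memoize the results of fetch(url) in a plain in-memory dict."""

    def __init__(self, fetch):
        self._fetch = fetch  # hypothetical: any callable taking a URL
        self._store = {}     # url -> previously fetched response

    def request(self, url):
        # Only the first lookup for a given URL calls the underlying fetch.
        if url not in self._store:
            self._store[url] = self._fetch(url)
        return self._store[url]
```

An in-memory dict like this is lost when the process exits and grows without bound, so a long-running service would likely want a persistent or size-limited store instead.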