Sickle
0.7.0 verified Fri May 01 auth: no python
A lightweight OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) client library for Python, designed to harvest metadata from OAI-PMH-compliant repositories. The current version is 0.7.0; maintenance releases appear as needed.
pip install sickle
Common errors
error AttributeError: module 'sickle' has no attribute 'OAIResponse'
cause The code accesses 'sickle.OAIResponse' directly, but the name is not a top-level attribute unless it is explicitly imported.
fix
Use 'from sickle import OAIResponse' or access it as 'sickle.oaipmh.OAIResponse' (deprecated).
error requests.exceptions.SSLError: HTTPSConnectionPool(host='...', port=443): Max retries exceeded with url: ... (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed')))
cause The OAI endpoint uses a self-signed or invalid SSL certificate.
fix
If you trust the endpoint, pass 'verify=False' to the Sickle constructor (not recommended in production). In production, make sure the endpoint serves a properly signed certificate.
Warnings
breaking In v0.7.0, the 'OAIResponse' class moved from 'sickle.oaipmh' to the top-level 'sickle' package. Old imports will break.
fix Change imports from 'from sickle.oaipmh import OAIResponse' to 'from sickle import OAIResponse'.
gotcha The 'ListRecords' iterator handles resumption tokens automatically. However, if you iterate manually and break early, the underlying HTTP session may not be closed cleanly. Use a context manager or iterate to completion.
fix Wrap usage in a 'with Sickle(...) as app:' context manager to ensure cleanup.
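To make "handles resumption tokens automatically" concrete, here is a minimal standard-library sketch of the paging loop that OAI-PMH defines. It is not Sickle's implementation: FAKE_PAGES, fake_fetch and iter_identifiers are hypothetical names, and the canned XML stands in for real HTTP responses.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Hypothetical canned responses keyed by resumption token (None = first request).
FAKE_PAGES = {
    None: (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
        '<record><header><identifier>oai:example:1</identifier></header></record>'
        '<record><header><identifier>oai:example:2</identifier></header></record>'
        '<resumptionToken>page2</resumptionToken>'
        '</ListRecords></OAI-PMH>'
    ),
    "page2": (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
        '<record><header><identifier>oai:example:3</identifier></header></record>'
        '<resumptionToken/>'
        '</ListRecords></OAI-PMH>'
    ),
}


def fake_fetch(token):
    """Stand-in for an HTTP request; returns the XML for one page."""
    return FAKE_PAGES[token]


def iter_identifiers(fetch):
    """Yield record identifiers, following resumption tokens until exhausted."""
    token = None
    while True:
        root = ET.fromstring(fetch(token))
        for ident in root.iter(OAI_NS + "identifier"):
            yield ident.text
        token_el = root.find(".//" + OAI_NS + "resumptionToken")
        # An absent or empty token marks the final page in OAI-PMH.
        if token_el is None or not (token_el.text or "").strip():
            return
        token = token_el.text.strip()


identifiers = list(iter_identifiers(fake_fetch))
print(identifiers)
```

Because the generator is lazy, a caller that breaks out of the loop early never requests the remaining pages, which is why cleanup is left to the caller.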
deprecated The 'max_retries' parameter is deprecated in favor of 'retry_status_codes' and 'retry_backoff_factor' via a requests adapter.
fix Use 'from requests.adapters import HTTPAdapter' to configure retries.
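As a rough sketch of adapter-based retry configuration, the snippet below mounts an HTTPAdapter carrying a urllib3 Retry policy on a requests Session. It assumes requests (and the urllib3 it depends on) is installed. The retry count, backoff factor and status codes are illustrative values, and how such a session is handed to Sickle depends on the installed version, so that step is not shown.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Illustrative policy: up to 3 retries with exponential backoff,
# retrying only on the listed server-side status codes.
retry_policy = Retry(
    total=3,
    backoff_factor=0.5,
    status_forcelist=[500, 502, 503, 504],
)

adapter = HTTPAdapter(max_retries=retry_policy)

session = requests.Session()
# Every request the session sends to a matching URL prefix uses this adapter.
session.mount("https://", adapter)
session.mount("http://", adapter)
```

No request is sent here; the block only builds the session.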
Imports
- OAIResponse
wrong: from sickle.oaipmh import OAIResponse
correct: from sickle import OAIResponse
- Sickle
from sickle import Sickle
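Code that must run against more than one Sickle version could try the import paths listed for OAIResponse in order. The helper below is a generic sketch of that idea: import_first is a hypothetical name, not part of Sickle, and the demonstration uses standard-library modules so that it runs without Sickle installed.

```python
import importlib


def import_first(candidates):
    """Return the first attribute that resolves from (module, name) pairs."""
    errors = []
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError) as exc:
            errors.append(f"{module_name}.{attr_name}: {exc}")
    raise ImportError("none of the candidates resolved: " + "; ".join(errors))


# With Sickle installed, the candidates would follow the paths listed above:
# OAIResponse = import_first([("sickle", "OAIResponse"),
#                             ("sickle.oaipmh", "OAIResponse")])

# Demonstration with standard-library modules only: the first candidate
# does not exist, so the helper falls through to json.loads.
loads = import_first([("no_such_module_xyz", "loads"), ("json", "loads")])
print(loads("[1, 2]"))
```

For a single pair of paths, a plain try/except ImportError around the two import statements does the same job with less machinery.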
Quickstart
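from sickle import Sickle

# Connect to the OAI-PMH endpoint; verify=True keeps certificate checks on.
sickle = Sickle('https://api.example.com/oai', verify=True)
# ListRecords returns an iterator that follows resumption tokens for you.
records = sickle.ListRecords(metadataPrefix='oai_dc')
for record in records:
    print(record.header.identifier, record.metadata.get('title', ''))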
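When printing fields such as the title, it may help to reduce a metadata entry to a single string. The sketch below assumes record.metadata behaves like a dictionary mapping element names to lists of strings, which is how Dublin Core metadata usually appears in Sickle. first_value and sample_metadata are hypothetical names, and the sample data is invented for illustration.

```python
def first_value(metadata, key, default=""):
    """Return the first non-empty value stored under key, or default."""
    values = metadata.get(key) or []
    for value in values:
        if value:
            return value
    return default


# Invented stand-in for record.metadata from an oai_dc record.
sample_metadata = {
    "title": ["A Study of Metadata Harvesting"],
    "creator": ["Doe, Jane", "Roe, Richard"],
    "subject": [],
}

title = first_value(sample_metadata, "title")
creator = first_value(sample_metadata, "creator")
subject = first_value(sample_metadata, "subject", default="(none)")
print(title, creator, subject)
```

In the loop above, the call would be first_value(record.metadata, 'title') in place of record.metadata.get('title', '').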