CacheControl
CacheControl provides an HTTP caching layer for the `requests` library, mimicking the caching algorithms found in `httplib2`. It aims to give `requests` sessions efficient, thread-safe caching by persisting HTTP responses according to their `Cache-Control` headers. The library is actively maintained, with frequent updates addressing Python version compatibility, bug fixes, and serialization improvements.
Warnings
- breaking: Python 3.8 support was dropped in v0.14.3, and Python versions older than 3.10 are no longer officially supported as of v0.14.4. Ensure your environment meets the `>=3.10` requirement.
- breaking: Serialization format change. Version `0.13.1` removed support for the older serialization formats (v1 and v2). Caches created with very old versions of `cachecontrol` (before `msgpack` was introduced around v0.12.0) will be unreadable after upgrading.
- gotcha: Since `v0.14.0`, the `msgpack` dependency has been constrained to `<2.0.0`. If other libraries in your project require `msgpack >= 2.0.0`, you may encounter dependency conflicts.
- gotcha: Versions of `cachecontrol` before `v0.12.13`/`v0.13.0` may have compatibility issues with `requests` sessions using `urllib3 2.0+`, leading to `IncompleteRead` errors.
- gotcha: A race condition when overwriting cache entries was fixed in `v0.14.2`. In earlier versions, concurrent writes to the same cache file could corrupt it.
- gotcha: With `DictCache` or older `FileCache` implementations, memory usage can be excessive for large binary responses. `SeparateBodyFileCache` was introduced for better memory efficiency by streaming large bodies.
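The two version-related warnings above (Python `>=3.10` and `msgpack` `<2.0.0`) can be checked before upgrading. The following is a minimal sketch using only the standard library; `major_version` and `check_environment` are hypothetical helper names, not part of `cachecontrol`, and the version parsing is deliberately naive.

```python
import sys
from importlib import metadata


def major_version(version):
    """Return the leading integer of a version string such as '1.0.8'."""
    return int(version.split(".")[0])


def check_environment():
    """Collect human-readable problems; an empty list means none were found."""
    problems = []
    # The warnings above state that Python >= 3.10 is required.
    if sys.version_info < (3, 10):
        problems.append("Python 3.10 or newer is required")
    try:
        msgpack_version = metadata.version("msgpack")
    except metadata.PackageNotFoundError:
        # msgpack is not installed, so there is nothing to compare.
        return problems
    # The warnings above describe a `<2.0.0` constraint on msgpack.
    if major_version(msgpack_version) >= 2:
        problems.append(
            f"msgpack {msgpack_version} violates the <2.0.0 constraint"
        )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Running the script prints one line per detected problem and nothing when both checks pass.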
Install
- pip install cachecontrol
- pip install cachecontrol[filecache]
Imports
- CacheControl
from cachecontrol import CacheControl
- FileCache
from cachecontrol.caches.file_cache import FileCache
- CacheControlAdapter
from cachecontrol.adapter import CacheControlAdapter
Quickstart
import requests
from cachecontrol import CacheControl
from cachecontrol.caches.file_cache import FileCache
# Create a standard requests session
sess = requests.Session()
# Wrap the session with CacheControl using a FileCache for persistent storage
# Replace '.web_cache' with your desired cache directory
cached_sess = CacheControl(sess, cache=FileCache('.web_cache'))
# Make a request - the response will be cached if HTTP headers allow
response = cached_sess.get('https://httpbin.org/cache/60')
print(f"First request status: {response.status_code}")
print(f"From cache (should be False): {getattr(response, 'from_cache', False)}")
# Make the same request again - it should now be served from cache
response = cached_sess.get('https://httpbin.org/cache/60')
print(f"Second request status: {response.status_code}")
print(f"From cache (should be True): {getattr(response, 'from_cache', False)}")
# Clean up the cache directory (optional for a real app)
# import shutil
# shutil.rmtree('.web_cache', ignore_errors=True)