Flexcache: Disk-based object caching
flexcache is a Python library (current version 0.3) designed to cache transformed versions of source objects to disk. It provides a robust and extensible framework for managing expensive calculations by storing their results persistently. The library allows for flexible cache invalidation strategies based on file modification time or content hash. It is currently in a pre-1.0 release, suggesting an active development cadence with potential for future changes.
Warnings
- gotcha The library uses Python's `pickle` protocol for serialization of cached objects. Deserializing data from an untrusted source using `pickle` can lead to arbitrary code execution. Ensure that cached files are only loaded from trusted sources or implement custom, secure serialization.
- gotcha Choosing the correct cache invalidation strategy (`DiskCacheByMTime` vs. `DiskCacheByHash`) is crucial. `DiskCacheByMTime` relies on file modification timestamps, which might not detect changes if a file's content is altered without updating its timestamp (e.g., atomic writes that replace the file).
- breaking As `flexcache` is in a pre-1.0 version (0.3), its API and internal behavior may not be fully stable. Future minor releases (e.g., 0.4, 0.5) might introduce breaking changes without strict adherence to semantic versioning until a 1.0 release.
Install
-
pip install flexcache
Imports
- DiskCacheByMTime
from flexcache import DiskCacheByMTime
- DiskCacheByHash
from flexcache import DiskCacheByHash
- DiskCache
from flexcache import DiskCache
Quickstart
import pathlib
from flexcache import DiskCacheByMTime
# Create a cache directory (ensure it exists)
cache_dir_path = pathlib.Path(os.environ.get('FLEXCACHE_CACHE_DIR', './my_flexcache_cache'))
cache_dir_path.mkdir(parents=True, exist_ok=True)
dc = DiskCacheByMTime(cache_folder=cache_dir_path)
def expensive_parser(source_path: pathlib.Path) -> str:
"""Simulates an expensive operation that reads and transforms a file."""
print(f"[Parser] Reading and processing: {source_path}")
return source_path.read_text().upper()
# Create a dummy source file
source_file_path = pathlib.Path("source.txt")
source_file_path.write_text("Hello, Flexcache World!")
# First call: `expensive_parser` will be executed and result cached
print("\n--- First call ---")
parsed_content_1 = dc.load(source_file_path, reader=expensive_parser)
print(f"Result: {parsed_content_1}")
# Second call: result will be loaded from cache without executing `expensive_parser`
print("\n--- Second call (from cache) ---")
parsed_content_2 = dc.load(source_file_path, reader=expensive_parser)
print(f"Result: {parsed_content_2}")
# Modify the source file to invalidate the cache (for DiskCacheByMTime)
source_file_path.write_text("UPDATED content for Flexcache!")
# Third call: `expensive_parser` will be re-executed due to source modification
print("\n--- Third call (cache invalidated) ---")
parsed_content_3 = dc.load(source_file_path, reader=expensive_parser)
print(f"Result: {parsed_content_3}")
# Clean up generated files and directory
source_file_path.unlink()
for item in cache_dir_path.iterdir():
if item.is_file():
item.unlink()
cache_dir_path.rmdir()