{"id":5644,"library":"jupyter-cache","title":"Jupyter Cache","description":"Jupyter Cache provides a defined interface for working with a cache of Jupyter notebooks. It enables execution and caching of notebooks, intelligently re-executing them only when code cells or related metadata have changed, rather than for every minor edit. The library offers both a Command-Line Interface (CLI) and a Python API for managing project notebooks, executing them, and retrieving detailed execution reports including timing statistics and exception tracebacks. It is utilized by projects like Jupyter Book to accelerate document builds by preventing unnecessary re-execution of unchanged notebook content. The current version is 1.0.1, with a release cadence driven by feature enhancements and dependency updates.","status":"active","version":"1.0.1","language":"en","source_language":"en","source_url":"https://github.com/executablebooks/jupyter-cache","tags":["jupyter","notebooks","cache","execution","cli","data-science","documentation"],"install":[{"cmd":"pip install jupyter-cache","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Core dependency for notebook execution, frequently updated.","package":"nbclient","optional":false},{"reason":"Used for database operations to manage the cache; compatibility fixes have been released.","package":"SQLAlchemy","optional":false},{"reason":"Used for working with notebook files programmatically.","package":"nbformat","optional":false}],"imports":[{"note":"Primary function to initialize the cache.","symbol":"get_cache","correct":"from jupyter_cache import get_cache"},{"note":"Used for advanced caching operations.","symbol":"CacheBundleIn","correct":"from jupyter_cache.base import CacheBundleIn"},{"note":"Used to load different notebook execution strategies.","symbol":"load_executor","correct":"from jupyter_cache.executors import load_executor"}],"quickstart":{"code":"import os\nimport pathlib\nimport nbformat as nbf\nfrom 
jupyter_cache import get_cache\nfrom jupyter_cache.executors import load_executor\n\n# Define cache path (the CLI can also read it from the JUPYTERCACHE env var)\ncache_path = pathlib.Path('./.my_notebook_cache')\n\n# Create a dummy notebook file\nnb_content = nbf.v4.new_notebook()\nnb_content.cells.append(nbf.v4.new_code_cell(\"a = 1\\nb = 2\\nprint(a + b)\"))\nnotebook_path = pathlib.Path('./example.ipynb')\nwith open(notebook_path, 'w', encoding='utf8') as f:\n    nbf.write(nb_content, f)\n\ntry:\n    # Initialize the cache\n    cache = get_cache(cache_path)\n    print(f\"Cache initialized at: {cache.path}\")\n\n    # Clear cache for a clean start (optional)\n    cache.clear_cache()\n\n    # Add the notebook to the project\n    # Note: the 'stage' naming used before v0.5.0 is now 'project'\n    cache.add_nb_to_project(str(notebook_path))\n    print(f\"Notebook '{notebook_path.name}' added to project.\")\n\n    # Execute the notebooks in the project\n    # 'local-serial' is one of the default executors\n    executor = load_executor('local-serial', cache=cache)\n    result = executor.run_and_cache()\n    print(f\"Succeeded: {len(result.succeeded)}, excepted: {len(result.excepted)}, errored: {len(result.errored)}\")\n\n    # List project records\n    print(\"\\nProject Records:\")\n    for record in cache.list_project_records():\n        print(f\"  ID: {record.pk}, URI: {record.uri}\")\n\n    # Merge the cached outputs into the source notebook (raises KeyError if no match)\n    source_nb = nbf.read(str(notebook_path), as_version=4)\n    cache_pk, merged_nb = cache.merge_match_into_notebook(source_nb)\n    print(f\"\\nMerged outputs from cache record {cache_pk}, cells: {len(merged_nb.cells)}\")\nexcept Exception as e:\n    print(f\"An error occurred: {e}\")\nfinally:\n    # Clean up the dummy notebook file whether or not execution succeeded\n    notebook_path.unlink(missing_ok=True)\n    # Consider shutil.rmtree(cache_path) for full 
cleanup in tests\n    # but be careful with production environments.\n\n","lang":"python","description":"This quickstart demonstrates how to programmatically initialize a Jupyter Cache, add a notebook to a project, execute it using a local serial executor, and retrieve the executed notebook with its outputs. It first creates a dummy notebook file, then uses the `jupyter_cache` API to manage its lifecycle within the cache."},"warnings":[{"fix":"Upgrade to Python 3.9 or newer, which is the current minimum required version.","message":"Python 3.7 support was dropped in v0.6.0. Projects using Python 3.7 must upgrade their Python version before updating to jupyter-cache v0.6.0 or later.","severity":"breaking","affected_versions":">=0.6.0"},{"fix":"Consult the official documentation for the updated CLI commands and Python API methods, specifically looking for 'notebook' and 'project' related functions instead of 'stage' or 'staging'.","message":"A significant API/CLI rewrite occurred in v0.5.0. Commands and Python API calls related to 'staging' notebooks were renamed to use 'notebook' or 'project'. For instance, 'stage add' became 'notebook add', and the Python API methods changed accordingly.","severity":"breaking","affected_versions":">=0.5.0"},{"fix":"Ensure that your notebook's execution environment is stable, all dependencies are pinned, and any non-deterministic operations are controlled (e.g., seeding random number generators). Avoid external dependencies that might change between executions without explicit cache invalidation.","message":"For jupyter-cache to be effective, notebooks must produce deterministic execution outputs. This means they should run in a consistent environment, avoid non-deterministic code (e.g., random number generation without seeding), and not rely on external, changing resources.","severity":"gotcha","affected_versions":"all"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}