nbstripout
nbstripout is a Python utility that strips outputs, metadata, and other extraneous content from Jupyter and IPython notebooks. It's most commonly used as a Git filter or hook to prevent large, noisy diffs and keep notebooks clean in version control. The current version is 0.9.1; the project is actively maintained, with several releases per year.
Warnings
- breaking Python 3.8 and 3.9 support was dropped in version 0.9.0. If you are using an older Python version, you must upgrade to Python 3.10+ or pin nbstripout to a version prior to 0.9.0.
- breaking The command-line option `--strip-empty-cells` was renamed to `--drop-empty-cells` in version 0.6.0. Scripts using the old flag will fail.
- gotcha Starting from version 0.7.0, cell IDs are renamed to be sequential by default. If a workflow relies on stable, persistent cell IDs (e.g., custom tooling that looks up cells by ID), this change can break it. Pass `--keep-id` to leave the existing IDs untouched.
- gotcha Since version 0.8.1, when `nbstripout --install` is used, the Git filter is declared as `required`. This means if the stripping process fails (e.g., due to a malformed notebook), the commit will be blocked. Previously, a failed strip might have been ignored.
- gotcha When using `nbstripout` in CI/CD pipelines to verify notebooks, pass the `--verify` flag (introduced in 0.8.0). Without it, `nbstripout` might not return a non-zero exit code when it finds notebooks that still need stripping, so the check can pass silently.
- gotcha Line ending normalization (CRLF vs LF) can cause issues, especially across different operating systems. Version 0.8.2 improved preservation of Windows CRLF, and 0.9.1 introduced `--unix-newline` to force LF endings.
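Before deciding whether to force LF endings, it can help to check which line endings a notebook file actually uses. The following is a minimal sketch using only the standard library; `detect_newline_style` and `to_unix_newlines` are hypothetical helper names for illustration and are not part of nbstripout.

```python
def detect_newline_style(data: bytes) -> str:
    """Classify the line endings in raw file bytes.

    Returns 'crlf', 'lf', 'mixed', or 'none' (no line breaks at all).
    Hypothetical helper for illustration; not part of nbstripout.
    """
    crlf = data.count(b"\r\n")
    bare_lf = data.count(b"\n") - crlf  # LF bytes not preceded by CR
    if crlf and bare_lf:
        return "mixed"
    if crlf:
        return "crlf"
    if bare_lf:
        return "lf"
    return "none"


def to_unix_newlines(data: bytes) -> bytes:
    """Convert CRLF line endings to LF, leaving existing LF untouched."""
    return data.replace(b"\r\n", b"\n")


# Example: a notebook saved on Windows typically contains CRLF endings.
windows_bytes = b'{\r\n "cells": []\r\n}\r\n'
print(detect_newline_style(windows_bytes))
print(detect_newline_style(to_unix_newlines(windows_bytes)))
```

Reading the file in binary mode (`open(path, "rb")`) matters here, because text mode may translate line endings before they can be inspected.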
Install
pip install nbstripout
Imports
- strip_output (programmatic API)
from nbstripout import strip_output
Quickstart
# 1. Install nbstripout (run in a shell)
#    pip install nbstripout
# 2. Install the Git filter in your repository (run in a shell, from inside the repository).
#    This ensures outputs are stripped automatically before committing.
#    Be aware that it modifies your .git/config.
#    nbstripout --install
# 3. Example of programmatic usage (optional)
import json

import nbformat
from nbstripout import strip_output

# Simulate reading a notebook file
example_notebook_content = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": ["Hello, nbstripout!\n"]
                }
            ],
            "source": "print('Hello, nbstripout!')"
        }
    ],
    "metadata": {
        "kernelspec": {
            "display_name": "Python 3",
            "language": "python",
            "name": "python3"
        }
    },
    "nbformat": 4,
    "nbformat_minor": 5
}
# strip_output operates on an nbformat NotebookNode, not on a JSON string.
# from_dict builds a separate node tree, so the dict above is left unchanged.
notebook = nbformat.from_dict(example_notebook_content)
# Strip outputs and execution counts. The keep_* arguments are required in
# recent releases; keep_id assumes a version with cell ID handling (0.7.0+).
stripped_notebook = strip_output(notebook, keep_output=False, keep_count=False, keep_id=False)
# Print both versions (outputs should be empty in the stripped one)
print("\n--- Original Notebook ---")
print(json.dumps(example_notebook_content, indent=2))
print("\n--- Stripped Notebook ---")
print(json.dumps(stripped_notebook, indent=2))
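To confirm that a notebook really is stripped, for example in a test or a CI step, one option is to scan its JSON for code cells that still carry outputs or an execution count. This is a minimal standard-library sketch, not nbstripout's own verification logic; `unstripped_cells` is a hypothetical helper name and it only handles nbformat 4 style notebooks with a top-level `cells` list.

```python
import json


def unstripped_cells(notebook_json: str) -> list:
    """Return the indexes of code cells that still have outputs or an execution count.

    Hypothetical helper for illustration; assumes an nbformat 4 notebook
    whose cells live in a top-level "cells" list.
    """
    notebook = json.loads(notebook_json)
    dirty = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells have nothing to strip here
        if cell.get("outputs") or cell.get("execution_count") is not None:
            dirty.append(index)
    return dirty


# Example: the first code cell was executed, the last one is already clean.
sample = json.dumps({
    "cells": [
        {"cell_type": "code", "execution_count": 1, "metadata": {},
         "outputs": [{"output_type": "stream", "name": "stdout", "text": ["hi\n"]}],
         "source": "print('hi')"},
        {"cell_type": "markdown", "metadata": {}, "source": "# Title"},
        {"cell_type": "code", "execution_count": None, "metadata": {},
         "outputs": [], "source": "x = 1"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
})
print(unstripped_cells(sample))  # → [0]
```

An empty list means no code cell retains outputs or counts; a non-empty list can be used to fail a check and point at the offending cells.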