unidiff
unidiff is a small Python library for parsing and working with unified diff data. It lets developers extract metadata, per-file changes, and hunk information from a diff. The current version is 0.7.5. The project is actively maintained, with periodic releases that deliver bug fixes and feature enhancements.
Warnings
- breaking Since unidiff v0.7.0, `PatchedFile.path` returns the *target* filename for renamed files. Code written against earlier versions that relied on another attribute for the target path, or inferred it some other way, may break.
- gotcha Fixes for git diffs with spaces in filenames and for diffs generated with `--no-prefix` landed in v0.7.4 and v0.7.5. Earlier versions may parse such diffs incorrectly or raise errors.
- gotcha unidiff may raise `UnidiffParseError: Unexpected hunk found` if the diff input lacks proper filename headers (e.g., `--- a/path/to/file` and `+++ b/path/to/file`). `difflib.unified_diff()` can produce diffs without usable headers, for example when `fromfile` and `tofile` are left at their empty defaults, but unidiff expects them.
- gotcha Handling of binary files was improved in v0.6.0 and v0.7.5. Older versions might not correctly identify or process changes in binary files, or might raise an error if a binary file is the first change in a patch.
Install
pip install unidiff
Imports
- PatchSet
from unidiff import PatchSet
Quickstart
import urllib.request
from unidiff import PatchSet

# Example: Parse a diff from a GitHub pull request URL
diff_url = 'https://github.com/matiasb/python-unidiff/pull/3.diff'
with urllib.request.urlopen(diff_url) as response:
    diff_data = response.read()
    encoding = response.headers.get_content_charset() or 'utf-8'

patch = PatchSet(diff_data.decode(encoding))
print(f"Parsed {len(patch)} files in the diff.")
for patched_file in patch:
    print(f"File: {patched_file.path}")
    print(f" Added: {patched_file.added} lines, Removed: {patched_file.removed} lines")
    for hunk in patched_file:
        for line in hunk:
            if line.is_added:
                print(f" + {line.value.strip()}")
            elif line.is_removed:
                print(f" - {line.value.strip()}")