Diff Parser

1.1 · active · verified Fri Apr 10

diff-parser is a Python package for parsing and representing diffs, supporting both git diff output and .diff files. For each changed file it exposes structured properties, including filenames, file paths, source and target hashes, and line counts. The current version is 1.1. The library appears to be actively maintained, with releases published on demand rather than on a fixed cadence.
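To make "source and target hashes" concrete: in git diff output they come from the `index` header line of each file section. The sketch below pulls them out with the standard library only. It does not use diff-parser and makes no claim about how the library does this internally; `INDEX_LINE` and `parse_index_line` are hypothetical names chosen for illustration.

```python
import re

# Matches a git diff header line such as "index 1234567..890abcd 100644".
# The trailing file mode is optional in git output, so it is optional here too.
INDEX_LINE = re.compile(
    r"^index (?P<source>[0-9a-f]+)\.\.(?P<target>[0-9a-f]+)(?: (?P<mode>\d+))?$"
)


def parse_index_line(line):
    """Return (source_hash, target_hash, mode), or None if the line does not match."""
    match = INDEX_LINE.match(line.strip())
    if match is None:
        return None
    return match.group("source"), match.group("target"), match.group("mode")


print(parse_index_line("index 1234567..890abcd 100644"))
# → ('1234567', '890abcd', '100644')
```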

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to parse a diff string using `diff-parser` by writing it to a temporary file and then creating a `Diff` object from its path. It then iterates through the parsed file blocks to extract key information such as filenames and line change counts.

import tempfile
import os
from diff_parser import Diff

# Create a dummy diff string representing file changes.
# Context (unchanged) lines must start with a single space in unified diff format.
diff_content = """diff --git a/old_file.py b/new_file.py
index 1234567..890abcd 100644
--- a/old_file.py
+++ b/new_file.py
@@ -1,3 +1,4 @@
 # This is an old line
-print("Hello, old world!")
+print("Hello, new world!")
+print("Another new line")
 # End of file
"""

# diff-parser expects a file path, so we write the diff content to a temporary file.
# TemporaryDirectory removes the directory and its contents on exit, even if an error occurs.
with tempfile.TemporaryDirectory() as temp_dir:
    diff_file_path = os.path.join(temp_dir, "example.diff")
    with open(diff_file_path, "w") as f:
        f.write(diff_content)

    # Initialize the Diff parser with the path to the diff file
    diff = Diff(diff_file_path)

    # Iterate through each changed file block in the diff
    for block in diff:
        print("--- File Change ---")
        print(f"Old filename: {block.old_filename}")
        print(f"New filename: {block.new_filename}")
        print(f"New filepath: {block.new_filepath}")
        print(f"Lines added: {block.added_lines_count}")
        print(f"Lines removed: {block.removed_lines_count}")
        # Accessing individual hunks and lines is also possible:
        # for hunk in block.hunks:
        #     for line in hunk.lines:
        #         print(f"  Line type: {line.type}, Content: {line.value.strip()}")
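
As a cross-check on the added and removed line counts printed by the quickstart, the counts can be tallied by hand from the same diff text. The following is a minimal sketch using only the standard library. It is not part of diff-parser, and `count_changes` is a hypothetical helper name. It assumes git-style input in which every file section begins with a `diff --git` line.

```python
def count_changes(diff_text):
    """Count added and removed lines inside the hunks of a git-style diff."""
    added = removed = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            # A new file section starts; its ---/+++ headers are not changes.
            in_hunk = False
        elif line.startswith("@@"):
            # Hunk header: the lines that follow are context, additions, or removals.
            in_hunk = True
        elif in_hunk:
            if line.startswith("+"):
                added += 1
            elif line.startswith("-"):
                removed += 1
    return added, removed


sample_diff = """diff --git a/old_file.py b/new_file.py
index 1234567..890abcd 100644
--- a/old_file.py
+++ b/new_file.py
@@ -1,3 +1,4 @@
 # This is an old line
-print("Hello, old world!")
+print("Hello, new world!")
+print("Another new line")
 # End of file
"""

print(count_changes(sample_diff))
# → (2, 1)
```

Tracking whether the loop is inside a hunk keeps the `---` and `+++` file headers out of the totals. For the quickstart diff, the library's reported counts would be expected to agree with this tally of two added lines and one removed line.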
