Diff Match Patch

20241021 · active · verified Thu Apr 09

Google's Diff Match and Patch libraries provide algorithms for synchronizing plain text: computing the diff between two texts, finding the best fuzzy match for a pattern, and applying patches. Originally developed for Google Docs, the algorithms are exposed by this Python package as a modern, actively maintained wrapper. It is suited to comparing texts, displaying their differences, and applying changes.
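Fuzzy matching is the one operation of the three that the quickstart does not exercise. A minimal sketch, assuming the package is installed and using `match_main(text, pattern, loc)`, where `loc` is a hint for where the match is expected; the sample strings are made up for illustration:

```python
from diff_match_patch import diff_match_patch

dmp = diff_match_patch()
text = "The quick brown fox jumps over the lazy dog."

# Exact pattern, with the location hint pointing straight at it.
exact_idx = dmp.match_main(text, "brown", 10)
print(exact_idx)  # 10

# Misspelled pattern with a nearby hint. The result is the index of the
# best match, or -1 if nothing scores well enough.
fuzzy_idx = dmp.match_main(text, "bruwn", 8)
print(fuzzy_idx)
```

How tolerant the search is depends on the `Match_Threshold` and `Match_Distance` attributes of the `diff_match_patch` object, which can be adjusted before calling `match_main`.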

Warnings

Install

Imports

Quickstart

This quickstart demonstrates the core workflow: computing the differences between two texts, generating a patch from those differences, and applying that patch to the original text. The optional `diff_cleanupSemantic` call modifies the diff list in place to make the output easier for a human to read.

from diff_match_patch import diff_match_patch

# Initialize the diff_match_patch object
dmp = diff_match_patch()

text1 = "The quick brown fox jumps over the lazy dog."
text2 = "A quick black fox jumps over the active cat."

# 1. Compute a diff
diffs = dmp.diff_main(text1, text2)

# Optional: Clean up the diff for semantic readability
dmp.diff_cleanupSemantic(diffs)

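print(f"Computed Diffs: {diffs}")
# Each entry is an (op, text) tuple: -1 = delete, 1 = insert, 0 = equal.
# Illustrative shape only; the exact split depends on the cleanup heuristics:
# [(-1, 'The'), (1, 'A'), (0, ' quick '), (-1, 'brown'), (1, 'black'), (0, ' fox jumps over the '), (-1, 'lazy dog'), (1, 'active cat'), (0, '.')]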

# 2. Generate a patch from the diffs
# (text1, diffs) is the preferred call form; passing text1, text2, and diffs together is deprecated
patches = dmp.patch_make(text1, diffs)
patch_text = dmp.patch_toText(patches)
print(f"\nGenerated Patch: {patch_text}")

# 3. Apply the patch to the original text
# Applying it to text1 should reproduce text2
new_text, results = dmp.patch_apply(patches, text1)

print(f"\nApplied Patch (New Text): {new_text}")
# results holds one boolean per patch: True if that patch applied successfully
print(f"Patch Application Results: {results}")
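The patch text printed in step 2 is meant to be stored or transmitted and parsed back later. A minimal sketch of that round trip, assuming the same two sample strings and using `patch_fromText` to rebuild the patch objects from their serialized form:

```python
from diff_match_patch import diff_match_patch

dmp = diff_match_patch()
text1 = "The quick brown fox jumps over the lazy dog."
text2 = "A quick black fox jumps over the active cat."

# Sender side: build patches directly from the two texts and serialize them.
patch_text = dmp.patch_toText(dmp.patch_make(text1, text2))

# Receiver side: parse the string back into patch objects and apply them.
restored = dmp.patch_fromText(patch_text)
new_text, results = dmp.patch_apply(restored, text1)

print(new_text == text2)  # whether the patched text reproduces text2
print(all(results))       # whether every individual patch applied
```

Here `patch_make` is called with two texts, which computes the diff internally; this is a convenience form for when the diff itself is not needed.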
