bsdiff4 Library

1.2.6 · active · verified Wed Apr 15

bsdiff4 is a Python library for binary diff and patch operations, built on the bsdiff algorithm and its BSDIFF4 patch format. It computes the difference between two byte sequences or files, then reconstructs the new sequence or file from the old one plus the diff. Version 1.2.6 exposes a stable interface: `diff` and `patch` for in-memory `bytes`, and `file_diff` and `file_patch` for files on disk.

Warnings

Install

Imports

Quickstart

This quickstart generates and applies binary diffs with `bsdiff4`, first for `bytes` objects in memory and then for files on disk. In each case it computes a diff from the old and new data, then applies that diff to the old data to reconstruct the new data.

import bsdiff4
import os

# Example with bytes
old_data = b"This is the old data string."
new_data = b"This is the new and updated data string."

# Generate a binary diff
diff_data = bsdiff4.diff(old_data, new_data)
print(f"Diff data length: {len(diff_data)} bytes")

# Apply the patch to get back the new data
patched_data = bsdiff4.patch(old_data, diff_data)
assert patched_data == new_data
print(f"Patched data (bytes): {patched_data.decode('utf-8')}\n")

# Example with files
file_old = 'old_file.txt'
file_new = 'new_file.txt'
file_diff = 'diff.bin'
file_patched = 'patched_file.txt'

with open(file_old, 'wb') as f:
    f.write(old_data)
with open(file_new, 'wb') as f:
    f.write(new_data)

# Generate diff between files (argument order: old file, new file, patch output)
bsdiff4.file_diff(file_old, file_new, file_diff)
print(f"File diff created: {file_diff}")

# Apply patch to recreate new file (argument order: old file, output file, patch)
bsdiff4.file_patch(file_old, file_patched, file_diff)

with open(file_patched, 'rb') as f:
    recreated_data = f.read()
assert recreated_data == new_data
print(f"Patched file created: {file_patched} (content matches new_file.txt)")

# Clean up generated files
os.remove(file_old)
os.remove(file_new)
os.remove(file_diff)
os.remove(file_patched)
