Daff: Data Diff and Patch Tables
daff (data diff) is a Python library for comparing tables, summarizing their differences, and applying those summaries as patches. It is optimized for tables that share a common origin, so in practice it tracks changes between versions of the "same" table. The library is actively maintained with frequent updates.
Warnings
- breaking `async` has been a reserved keyword since Python 3.7. Code that interacts with `daff` and uses `async` as a variable name, function parameter, or other identifier raises a `SyntaxError` on Python 3.7 and newer; rename the identifier to fix it.
- gotcha The optional `sqlite3` dependency was updated from version 3 to 4 in `daff` v1.3.37. While `sqlite3` is a built-in Python module, this update might subtly affect behavior or introduce incompatibilities if your application relies on specific, advanced features or versions of SQLite that `daff` interacts with, especially when using `--input-format sqlite`.
- gotcha Like many Python libraries, `daff`'s API might expose functions or methods where mutable objects (e.g., lists, dictionaries) are used as default arguments. Using mutable defaults can lead to unexpected side effects across multiple function calls, as the default object is created only once when the function is defined.
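The two Python-level pitfalls in the warnings above can be reproduced without `daff` at all. The sketch below is a minimal illustration, assuming Python 3.7 or newer; `compiles`, `collect_shared`, and `collect_fresh` are hypothetical helpers written for this example and are not part of `daff`'s API.

```python
def compiles(source):
    """Return True if `source` parses as Python, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# `async` is a reserved keyword on Python 3.7+, so it cannot be an identifier.
async_as_name_ok = compiles("async = True")   # False on Python 3.7+
renamed_ok = compiles("is_async = True")      # True


# Pitfall: the default list is created once, when the function is defined,
# and is then shared by every call that omits `rows`.
def collect_shared(row, rows=[]):
    rows.append(row)
    return rows


# Usual fix: default to None and build a fresh list inside the call.
def collect_fresh(row, rows=None):
    if rows is None:
        rows = []
    rows.append(row)
    return rows


shared_first = list(collect_shared("a"))    # ["a"]
shared_second = list(collect_shared("b"))   # ["a", "b"]: state leaked between calls
fresh_first = collect_fresh("a")            # ["a"]
fresh_second = collect_fresh("b")           # ["b"]

print(async_as_name_ok, renamed_ok)
print(shared_second, fresh_second)
```

If a function you call accepts such a default, passing your own list or dictionary explicitly on every call sidesteps the shared state.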
Install
pip install daff
Imports
- daff
import daff
Quickstart
import daff
data1 = [
    ['Country', 'Capital'],
    ['Ireland', 'Dublin'],
    ['France', 'Paris'],
    ['Spain', 'Barcelona']
]
data2 = [
    ['Country', 'Code', 'Capital'],
    ['Ireland', 'ie', 'Dublin'],
    ['France', 'fr', 'Paris'],
    ['Spain', 'es', 'Madrid'],
    ['Germany', 'de', 'Berlin']
]
# Create table views for the data
table1 = daff.PythonTableView(data1)
table2 = daff.PythonTableView(data2)
# Compare tables to get alignment information
alignment = daff.Coopy.compareTables(table1, table2).align()
# Prepare an empty table to hold the diff results
data_diff = []
table_diff = daff.PythonTableView(data_diff)
flags = daff.CompareFlags()
highlighter = daff.TableDiff(alignment, flags)
highlighter.hilite(table_diff)
# Render the diff to HTML for visualization
diff_render = daff.DiffRender()
diff_render.usePrettyArrows(False) # Optional: control arrow style
diff_render.render(table_diff)
table_diff_html = diff_render.html()
print("Original Table 1:", data1)
print("Modified Table 2:", data2)
print("\nHTML Diff:\n", table_diff_html)