{"id":4345,"library":"csv-diff","title":"CSV Diff","description":"csv-diff is a Python CLI tool and library for comparing the contents of two CSV, TSV, or JSON files. It identifies added, removed, and changed rows based on a specified key, ignoring cosmetic differences such as row and column ordering. The library is actively maintained; the current version is 1.2.","status":"active","version":"1.2","language":"en","source_language":"en","source_url":"https://github.com/simonw/csv-diff","tags":["csv","diff","cli","json","data comparison"],"install":[{"cmd":"pip install csv-diff","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Powers the command-line interface (CLI).","package":"click"},{"reason":"Computes the field-level differences between rows that share the same key.","package":"dictdiffer"}],"imports":[{"note":"Loads CSV or TSV data from an open file-like object (not a path) into a dictionary of rows indexed by the `key` column. JSON input is loaded with the separate `load_json` function.","symbol":"load_csv","correct":"from csv_diff import load_csv, compare"},{"note":"The main function for diffing two loaded datasets. Pass the previous dataset first and the current one second.","symbol":"compare","correct":"from csv_diff import load_csv, compare"}],"quickstart":{"code":"import io\nfrom csv_diff import load_csv, compare\n\n# Simulate two CSV files as in-memory strings\ncsv1_data = \"\"\"id,name,age\n1,Alice,30\n2,Bob,24\n3,Charlie,35\"\"\"\n\ncsv2_data = \"\"\"id,name,age\n1,Alice,31\n3,Charlie,35\n4,David,28\"\"\"\n\n# Load the CSV data, specifying the key column\ncsv1 = load_csv(io.StringIO(csv1_data), key=\"id\")\ncsv2 = load_csv(io.StringIO(csv2_data), key=\"id\")\n\n# Compare the 
two CSVs\ndiff = compare(csv1, csv2)\n\n# Print the detected differences\nprint(f\"Added rows: {diff.get('added')}\")\nprint(f\"Removed rows: {diff.get('removed')}\")\nprint(f\"Changed rows: {diff.get('changed')}\")\nprint(f\"Columns added: {diff.get('columns_added')}\")\nprint(f\"Columns removed: {diff.get('columns_removed')}\")","lang":"python","description":"This quickstart demonstrates how to use `csv-diff` programmatically to compare two in-memory CSV datasets. It loads each dataset with `load_csv`, using `id` as the unique key, then calls `compare`, which returns a dictionary of added, removed, and changed rows plus added and removed columns."},"warnings":[{"fix":"Always provide the `--key` option when using the CLI, or the `key` parameter to `load_csv` when using the library.","message":"Specifying a key (`--key` on the CLI, or the `key` parameter of `load_csv`) is crucial. Without one, rows are identified by a hash of their full contents, so an edited row is reported as one removed row plus one added row rather than as a change. Versions before 1.0 could also raise an error when no key was given; this was fixed in 1.0.","severity":"gotcha","affected_versions":"< 1.0"},{"fix":"Ensure your `csv-diff` version is 1.0 or newer. If you relied on the old (buggy) behavior, review your diffs after updating.","message":"Prior to version 1.0, column names containing a `.` character could cause bugs. 
This was fixed in 1.0, potentially changing diff results for users who encountered this issue in earlier versions.","severity":"breaking","affected_versions":"< 1.0"},{"fix":"Pass `--format=csv`, `--format=tsv`, or `--format=json` on the CLI. In library code, call the loader that matches the input: `load_csv` for CSV/TSV or `load_json` for JSON.","message":"`csv-diff` automatically detects whether a file is comma- or tab-separated, but detection can guess wrong on ambiguous files, and JSON input must be requested explicitly with `--format=json`. Specifying the format up front avoids unexpected parsing behavior.","severity":"gotcha","affected_versions":"All versions"},{"fix":"For programmatic consumption of diff results, use the dictionary returned by `compare` rather than parsing the CLI's human-readable text output. If you must consume CLI output, use the `--json` option, which emits machine-readable JSON.","message":"The format of the human-readable CLI output changed significantly in versions 0.2 and 0.3.1 (for example, the order of output and the amount of detail included). Scripts that parse the CLI's plain-text output from older versions may break or produce incorrect results with newer versions.","severity":"deprecated","affected_versions":"< 0.3.1"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}