{"id":5591,"library":"checksumdir","title":"Checksumdir: Directory Hashing Utility","description":"Checksumdir is a Python library that computes a single cryptographic hash for the contents of a directory. By default it hashes only file contents and ignores file names, paths, and metadata. The current version is 1.3.0; releases are infrequent, with the latest PyPI release dated August 15, 2025.","status":"active","version":"1.3.0","language":"en","source_language":"en","source_url":"http://github.com/cakepietoast/checksumdir","tags":["hash","checksum","md5","directory","utility"],"install":[{"cmd":"pip install checksumdir","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Runtime requirement; version 1.3.0 and later require Python 3.9 or newer.","package":"python","optional":false}],"imports":[{"symbol":"dirhash","correct":"from checksumdir import dirhash"}],"quickstart":{"code":"import checksumdir\nimport os\nimport tempfile\n\n# Create a temporary directory and some files for demonstration\nwith tempfile.TemporaryDirectory() as tmpdir:\n    print(f\"Created temporary directory: {tmpdir}\")\n    os.makedirs(os.path.join(tmpdir, 'subdir'), exist_ok=True)\n\n    with open(os.path.join(tmpdir, 'file1.txt'), 'w') as f:\n        f.write('content one')\n    with open(os.path.join(tmpdir, 'subdir', 'file2.txt'), 'w') as f:\n        f.write('content two')\n\n    # Calculate MD5 hash of the directory contents (default ignores filenames/paths)\n    md5_hash = checksumdir.dirhash(tmpdir, 'md5')\n    print(f\"MD5 hash (content only): {md5_hash}\")\n\n    # Calculate SHA1 hash, also hashing each file's relative path\n    sha1_hash_with_paths = checksumdir.dirhash(tmpdir, 'sha1', include_paths=True)\n    print(f\"SHA1 hash (including file paths): {sha1_hash_with_paths}\")\n\n    # Example with exclusion\n    with open(os.path.join(tmpdir, 'temp_log.log'), 'w') as f:\n        f.write('temporary log content')\n\n    # Calculate MD5, excluding 
files ending with .log\n    # (extensions are given without the leading dot)\n    md5_hash_excluded = checksumdir.dirhash(tmpdir, 'md5', excluded_extensions=['log'])\n    print(f\"MD5 hash (excluding .log files): {md5_hash_excluded}\")\n\n# The temporary directory is automatically cleaned up here","lang":"python","description":"This example computes MD5 and SHA1 hashes for a directory. It shows the default behavior of hashing only file contents, how to include file paths in the hash calculation, and how to exclude files by extension (passed without the leading dot). It uses a temporary directory so the demonstration is self-contained."},"warnings":[{"fix":"To include file paths in the hash calculation, pass `include_paths=True`, introduced in version 1.2.0.","message":"By default, `dirhash` computes a hash based *only* on the contents of the files within a directory, ignoring file names, directory structure, and metadata such as timestamps. Renaming a file or moving it to a subdirectory therefore does not change the hash as long as its content stays the same.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Check that the path is an existing directory before hashing it, e.g., with `os.path.isdir(directory_path)`, or catch the `TypeError`.","message":"Calling `checksumdir.dirhash()` with a path that is not an existing directory raises a `TypeError`, not a `FileNotFoundError`.","severity":"gotcha","affected_versions":"All versions"},{"fix":"For extremely large datasets, consider hashing only critical subsets, or cache hashes for directories that change infrequently. The separate `dirhash` package (a similar tool, not the `dirhash` function in this library) supports multiprocessing for additional speed.","message":"Hashing very large directories, or directories with a vast number of small files, can be I/O and CPU intensive and therefore slow. 
Consider the scale of directories being hashed in performance-critical applications.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Upgrade your Python interpreter to version 3.9 or newer, or pin `checksumdir` to a compatible earlier version (e.g., `pip install 'checksumdir<1.3.0'`).","message":"Checksumdir 1.3.0 and later require Python 3.9 or newer. Users on Python 3.8 or earlier must either upgrade their Python environment or use an older release of the library.","severity":"breaking","affected_versions":">=1.3.0"}],"env_vars":null,"last_verified":"2026-04-10T00:00:00.000Z","next_check":"2026-07-09T00:00:00.000Z"}