{"id":24288,"library":"py-tlsh","title":"py-tlsh","description":"TLSH (Trend Micro Locality Sensitive Hash) is a fuzzy hashing algorithm for similarity comparison of binary data. The py-tlsh package provides a C++ Python extension for computing and comparing TLSH digests. The current version is 4.12.1; a major v5.0.0 also exists (see warnings). Release cadence is irregular.","status":"active","version":"4.12.1","language":"python","source_language":"en","source_url":"https://github.com/trendmicro/tlsh","tags":["fuzzy-hashing","locality-sensitive-hashing","similarity","trend-micro","malware"],"install":[{"cmd":"pip install py-tlsh","lang":"bash","label":"Install from PyPI"}],"dependencies":[],"imports":[{"note":"The package installs a top-level module named `tlsh`; import it directly rather than from a submodule.","wrong":"from tlsh import tlsh","symbol":"tlsh","correct":"import tlsh"}],"quickstart":{"code":"import os\nimport tlsh\n\n# TLSH needs at least 50 bytes of sufficiently varied input;\n# shorter or low-variety input yields \"TNULL\" instead of a digest.\ndata = os.urandom(512)\nh1 = tlsh.hash(data)\nprint(\"Hash:\", h1)\n\n# Compare two digests: diff() returns a distance (0 = identical)\nh2 = tlsh.hash(data + b\"hello world!\")\nscore = tlsh.diff(h1, h2)\nprint(\"Difference score:\", score)\n\n# \"TNULL\" signals that no digest could be computed\nprint(\"Is valid:\", h1 != \"TNULL\")","lang":"python","description":"Basic usage: hash creation, comparison, and validation."},"warnings":[{"fix":"Use v4.12.1 (the latest PyPI release) unless you explicitly need v5 features. When exchanging digests with tools built on 3.x, check for the 'T1' prefix and update storage and comparison logic accordingly; `tlsh.oldhash()` produces the prefix-less format.","message":"TLSH digests carry a 'T1' version prefix as of 4.0.0, so 4.x digests are 72 characters long; 3.x and earlier emit 70 hex characters without the prefix. Tools that expect the old format will not accept prefixed digests unchanged. A v5.0.0 exists but is not on PyPI as of this writing; check its release notes for format changes before upgrading.","severity":"breaking","affected_versions":">=4.0.0"},{"fix":"If you need to ensure forward compatibility, consider stripping or handling the 'T1' prefix. 
For now, stick with v4.12.1 if you want to avoid breaking changes.","message":"`tlsh.hash()` returns the digest as a string that, since 4.0.0, begins with the 'T1' version prefix. The prefix-less 70-character form is the legacy format, produced by `tlsh.oldhash()`; avoid it in new code.","severity":"deprecated","affected_versions":">=4.0.0"},{"fix":"Use `tlsh.diff(hash1, hash2)` and interpret 0 as identical. If you need a similarity metric, you can invert the score (e.g., similarity = max(0, 100 - diff)), but the maximum difference is not capped at 100, so treat this only as a rough heuristic.","message":"`tlsh.diff()` returns a distance score: 0 means identical, and higher values mean more different. This is the opposite of the 0-100 similarity score used by some other fuzzy-hashing libraries, so do not treat the result as a similarity percentage.","severity":"gotcha","affected_versions":"all"},{"fix":"Pass bytes: `tlsh.hash(b'hello')` or `tlsh.hash('hello'.encode('utf-8'))`.","message":"`tlsh.hash()` requires a bytes-like object, not a `str`. Passing a plain string raises a TypeError.","severity":"gotcha","affected_versions":"all"},{"fix":"Install a C++ compiler (e.g., `build-essential` on Ubuntu, Xcode on macOS, Visual Studio Build Tools on Windows). On Windows, consider the unofficial binary wheels from Christoph Gohlke, or install via conda.","message":"The py-tlsh extension is compiled from C++. On some platforms (especially Windows and older Linux distros), installation may fail because a C++ compiler or headers are missing. The wheels on PyPI may not cover every platform.","severity":"gotcha","affected_versions":"all"}],"env_vars":null,"last_verified":"2026-05-01T00:00:00.000Z","next_check":"2026-07-30T00:00:00.000Z","problems":[{"fix":"Encode the string to bytes: `tlsh.hash('hello'.encode('utf-8'))`","cause":"A `str` was passed to `tlsh.hash()` instead of bytes.","error":"TypeError: a bytes-like object is required, not 'str'"},{"fix":"Install via `pip install py-tlsh`. 
If that fails, ensure you have a C++ compiler (build-essential on Linux, Xcode command line tools on macOS).","cause":"py-tlsh is not installed or was installed incorrectly.","error":"ModuleNotFoundError: No module named 'tlsh'"},{"fix":"Ensure the digest is 72 characters starting with 'T1' (4.x format) or exactly 70 hex characters (legacy format without the prefix). A result of 'TNULL' from `tlsh.hash()` means the input was too short or too uniform to hash, so check for it before comparing.","cause":"The provided string is not a valid TLSH digest (wrong length or invalid characters).","error":"ValueError: Invalid TLSH hash"},{"fix":"Reinstall py-tlsh via pip (it bundles the compiled extension). If using a custom build, set LD_LIBRARY_PATH or install the library system-wide.","cause":"The dynamic library was not found. This usually occurs when building from source or using a non-standard installation.","error":"OSError: [Errno 2] No such file or directory: 'libtlsh.so'"}],"ecosystem":"pypi","meta_description":null,"install_score":null,"install_tag":null,"quickstart_score":null,"quickstart_tag":null}