TruffleHog (Python Library)
TruffleHog (Python library, version 2.2.1) scans git repositories for sensitive information, such as high-entropy strings and secrets, by analyzing commit history. It was last released on PyPI in 2017 (with a re-upload of the same version in 2021) and is largely unmaintained, primarily supporting Python 2 environments. Active development has moved to a separate Go-based implementation (TruffleHog v3.x by Truffle Security), which is not this Python library.
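To make "high entropy strings" concrete, the following is a minimal sketch of Shannon entropy computed over a fixed character set, which is the general idea behind entropy-based secret detection. The function name, the `BASE64_CHARS` constant, and the sample strings are illustrative assumptions rather than code copied from the library, and no detection threshold is implied.

```python
import math

# Illustrative alphabet (assumption); the library's own character sets may differ.
BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="


def shannon_entropy(data, charset=BASE64_CHARS):
    """Return the Shannon entropy of `data` in bits per character,
    counting only characters that belong to `charset`."""
    if not data:
        return 0.0
    entropy = 0.0
    for ch in charset:
        p = data.count(ch) / len(data)
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


if __name__ == "__main__":
    # A repetitive string scores low; a varied, key-like string scores higher.
    for sample in ("aaaaaaaaaaaaaaaaaaaa", "q7Zk2LmX9pTfW4vB8sNc"):
        print(f"{sample!r}: {shannon_entropy(sample):.2f} bits/char")
```

Strings that look random (API keys, tokens) use many distinct characters about equally often, so they score higher than ordinary identifiers or prose.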
Common errors
-
ModuleNotFoundError: No module named 'truffleHog'
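cause The `trufflehog` package is not installed in the current environment, or it was installed for a different Python interpreter than the one running your script. The import name is case-sensitive (`truffleHog`, capital H).
fix Install the package with the interpreter you intend to use: `python -m pip install trufflehog`. If the error persists, confirm that the script runs in the same environment where the package was installed.
-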
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position X: ordinal not in range(128)
cause This error typically occurs in Python 2 environments when `truffleHog` encounters non-ASCII characters in git commit messages, file paths, or content, due to Python 2's default ASCII encoding.
fix Ensure your system's locale is set to UTF-8 (e.g., `export LC_ALL=en_US.UTF-8`). For robust handling of diverse character sets, consider using a more modern Python version or the Go-based TruffleHog.
-
Error: No such file or directory: 'git'
cause TruffleHog (via `GitPython`) relies on the `git` command-line executable being installed and accessible on the system's PATH.
fix Install Git and make sure its executable is on your PATH (e.g., `sudo apt-get install git` on Debian/Ubuntu, `brew install git` on macOS).
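The three errors above can be checked for before a scan is attempted. The sketch below is one possible preflight routine built only on the standard library; the function name `preflight` and the wording of the messages are assumptions made for illustration.

```python
import importlib.util
import locale
import shutil


def preflight():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []

    # GitPython shells out to the git executable, so it must be on PATH.
    if shutil.which("git") is None:
        problems.append("git executable not found on PATH")

    # The import name is case-sensitive: truffleHog, not trufflehog.
    if importlib.util.find_spec("truffleHog") is None:
        problems.append("module 'truffleHog' is not importable; "
                        "try: python -m pip install trufflehog")

    # A non-UTF-8 preferred encoding makes decode errors more likely.
    encoding = locale.getpreferredencoding(False)
    if encoding.lower().replace("-", "") != "utf8":
        problems.append(f"preferred encoding is {encoding}, not UTF-8")

    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print("problem:", issue)
    if not issues:
        print("environment looks ready")
```

Running this with the same interpreter that will run the scan helps rule out the case where the package was installed for a different Python.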
Warnings
- breaking This Python library version (2.2.1) is primarily designed for Python 2, which reached End-of-Life in 2020. While it has some Python 3 compatibility, users may encounter `UnicodeDecodeError` or other compatibility issues in modern Python 3 environments.
- gotcha This PyPI package `trufflehog` (version 2.2.1, `dxa4481/truffleHog`) is an older, distinct, and largely unmaintained Python library. It should not be confused with the actively developed, Go-based `TruffleHog` (v3.x by `trufflesecurity/trufflehog`), which offers significantly more features, better performance, and ongoing updates.
- gotcha Due to its age and lack of updates, the Python `truffleHog` library has limited functionality compared to the modern Go version. It lacks many current detectors for various secret types, active verification capabilities, and integrations with cloud services or CI/CD pipelines.
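Because the Python package and the Go tool share a name, it can help to confirm what is actually installed in the current Python environment. The sketch below uses `importlib.metadata` (Python 3.8+); the helper names `installed_version` and `describe` are hypothetical, and the check only covers pip-installed distributions, not a Go binary on PATH.

```python
from importlib import metadata


def installed_version(dist_name="truffleHog"):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def describe(version):
    """Classify a version string reported for the PyPI package."""
    if version is None:
        return "not installed via pip in this environment"
    if version.split(".")[0] == "2":
        return f"legacy Python library ({version})"
    return f"unexpected version ({version}); check which project provides it"


if __name__ == "__main__":
    print(describe(installed_version()))
```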
Install
-
pip install trufflehog
Imports
- truffleHog
from truffleHog import truffleHog
Quickstart
import os
import sys

from truffleHog import truffleHog

# NOTE: This Python library (v2.2.1) is largely unmaintained.
# For active development and modern features, consider the Go-based TruffleHog CLI.
# This quickstart demonstrates the Python API of the 2.2.1 release.

# Path to an existing local clone; override it with TRUFFLEHOG_REPO_PATH.
repo_path = os.environ.get('TRUFFLEHOG_REPO_PATH', '/tmp/trufflehog_test_repo')

if not os.path.isdir(os.path.join(repo_path, '.git')):
    # GitPython cannot open a directory that is not a git repository, so stop early.
    # A repository can be cloned first with, for example:
    #   import git
    #   git.Repo.clone_from('https://github.com/dxa4481/truffleHog.git', repo_path)
    sys.exit(f"'{repo_path}' is not a git repository. Clone one there or set TRUFFLEHOG_REPO_PATH.")

print(f"Scanning repository: {repo_path}")

# `find_strings` runs the scan. The keyword names below follow the function
# signature in the 2.x source, including the library's own spelling of
# `surpress_output`; check your installed copy if a TypeError is raised.
results = truffleHog.find_strings(
    git_url=None,            # not used for cloning when repo_path is given
    since_commit=None,       # scan from the start of history
    max_depth=1000000,       # effectively all commits
    printJson=False,         # only affects how findings are printed
    do_regex=True,           # enable regex checks
    do_entropy=True,         # enable entropy checks
    surpress_output=True,    # collect findings without printing them
    repo_path=repo_path,
)

# The return value is a dict; `foundIssues` lists files that each hold one finding as JSON.
issue_files = results.get('foundIssues', [])
if issue_files:
    print("\nFound potential secrets:")
    for issue_file in issue_files:
        print(issue_file)
else:
    print("\nNo secrets found.")
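Reading the scan results usually takes one more step. The sketch below assumes, based on the 2.x source, that `find_strings` returns a dict whose `foundIssues` entry is a list of paths to JSON files, one finding per file, and that each finding carries keys such as `reason`, `path`, and `commitHash`. Treat those key names as assumptions and inspect one file from your own run before relying on them. The helper names `load_issues` and `summarize` are hypothetical.

```python
import json


def load_issues(result):
    """Load finding records from the JSON files listed in result['foundIssues']."""
    issues = []
    for path in result.get("foundIssues", []):
        with open(path) as handle:
            issues.append(json.load(handle))
    return issues


def summarize(issue):
    """Build a one-line summary; keys that are missing fall back to '?'."""
    commit = str(issue.get("commitHash", "?"))[:8]
    return f"{issue.get('reason', '?')} in {issue.get('path', '?')} @ {commit}"


if __name__ == "__main__":
    # Stand-in for the value returned by truffleHog.find_strings(...).
    example_result = {"foundIssues": []}
    for record in load_issues(example_result):
        print(summarize(record))
```

Using `.get` with a fallback keeps the summary from failing if a given release names a field differently.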