libsast - Generic SAST Library
libsast is a Python library providing generic Static Application Security Testing (SAST) capabilities, built upon Semgrep and regex patterns. It allows users to define custom rules and scan codebases for security vulnerabilities. The library is actively maintained with frequent patch and minor releases, with the current version being 3.1.6.
Common errors
-
ModuleNotFoundError: No module named 'semgrep'
cause Attempting to use `libsast` with Semgrep-based rules without having the `semgrep` package installed.fixInstall `libsast` with the `semgrep` extra: `pip install libsast[semgrep]` or install `semgrep` separately: `pip install semgrep`. -
TypeError: Cannot pickle '_thread.RLock' object
cause This error or similar multiprocessing-related issues often occur when using the default `ProcessPoolExecutor` in environments like AWS Lambda or Celery, which might not correctly handle process forking or serialization.fixWhen initializing `Scan`, specify a different multiprocessing executor: `scanner = Scan(..., multiprocessing_executor='thread')` or `scanner = Scan(..., multiprocessing_executor='billiard')`. -
libsast.exceptions.InvalidRuleError: Rule 'MY_CUSTOM_RULE' is missing required patterns.
cause A custom `Rule` object was defined without a valid `patterns` list or with an incorrectly structured pattern within the list.fixEnsure the `patterns` argument in your `Rule` definition is a list of dictionaries, where each dictionary contains at least a `regex` or `semgrep` key. Example: `patterns=[{'regex': r'bad_pattern', 'confidence': 'high'}]`.
Warnings
- gotcha Multiprocessing (ProcessPoolExecutor) can cause issues in serverless environments (e.g., AWS Lambda) or task queues (e.g., Celery, Django-Q).
- gotcha The `semgrep` dependency is now optional. If you use Semgrep-based rules without installing `semgrep`, you will encounter runtime errors.
- breaking Internal APIs for `PatternMatcher` and `ChoiceMatcher` were updated.
- gotcha libsast requires Python 3.8 or newer, but is not compatible with Python 4.0 or higher.
Install
-
pip install libsast -
pip install libsast[semgrep]
Imports
- Scan
from libsast.core.scan import Scan
- Rule
from libsast.core.rule import Rule
Quickstart
import os
import tempfile
import shutil
from libsast.core.scan import Scan
from libsast.core.rule import Rule
# Create a dummy directory and file for scanning
temp_dir = tempfile.mkdtemp()
temp_file_path = os.path.join(temp_dir, "test_code.py")
try:
with open(temp_file_path, "w") as f:
f.write("password = 'mysecretpassword'\n")
f.write("API_KEY = 'YOUR_API_KEY_HERE'\n")
f.write("def my_func():\n print('Hello')\n")
# Define a simple regex rule
my_rule = Rule(
rule_id="HARDCODED_SECRET",
description="Detects hardcoded sensitive keywords like 'password' or 'API_KEY'",
severity="high",
patterns=[
{"regex": r"(password\s*=|API_KEY\s*=)", "confidence": "high", "message": "Hardcoded secret found."}
],
metadata={
"cwe": "CWE-798",
"owasp": "A07:2021-Identification and Authentication Failures"
}
)
# Initialize the scanner
# In environments like AWS Lambda or Celery, consider `multiprocessing_executor="thread"` or "billiard"
scanner = Scan(
target=temp_dir, # Scan the entire directory
rules=[my_rule],
enable_default_rules=False, # Set to True to include libsast's built-in rules
multiprocessing_executor="process" # Options: "process", "thread", "billiard"
)
# Run the scan
results = scanner.run()
# Process results
if results:
print(f"Found {len(results)} vulnerabilities:")
for result in results:
print(f"- Rule ID: {result.rule_id}")
print(f" File: {result.file_path}")
print(f" Line: {result.line_number}")
print(f" Match: '{result.match_string}'")
if result.message:
print(f" Message: {result.message}")
else:
print("No vulnerabilities found.")
finally:
# Clean up the temporary directory
shutil.rmtree(temp_dir)