YARA Python Interface
yara-python is the official Python interface for YARA, a pattern-matching tool that security researchers use to identify and classify malware. It provides bindings to the YARA C library, so Python applications can compile YARA rules and apply them to data. The library is actively maintained, and new versions (currently 4.5.4) are typically released alongside updates to the underlying YARA engine.
Warnings
- breaking The structure of the `yara.Match.strings` field changed in version 4.3.0. Previously, it was a list of tuples `(<offset>, <string identifier>, <string data>)`. It is now a list of `yara.StringMatch` objects, which in turn contain `yara.StringMatchInstance` objects for actual matches.
- gotcha On Linux and macOS, `pip install yara-python` may fail when no pre-built wheel is available for your platform and Python version, because pip then has to build the C extension from source. `yara-python` is a wrapper around the YARA C library, so that build needs a C compiler and the Python development headers. If the build links against a system-wide YARA, the YARA C library and its development headers must also be installed through the system's package manager.
- gotcha Versions 4.3.x had a memory leak and potential heap corruption issue related to incorrect reference counting when calling `yara.StringMatchInstance.plaintext()` without an XOR key. This was fixed in YARA-Python 4.4.0.
- gotcha There's a distinction between `yara.compile()` and `yara.load()`. `yara.compile()` processes YARA rule source code (from strings, files, or file paths). `yara.load()` is used to load *pre-compiled* YARA rule files (typically with a `.yarac` extension) that have been previously saved using `rules.save()`.
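The difference between `yara.compile()` and `yara.load()` described above can be sketched as a round trip: compile from source, save the compiled rules, load them back, and scan the same data with both objects. This is a minimal sketch, not a definitive recipe. The helper name `roundtrip` and the file name `rules.yarac` are illustrative assumptions, and the import is guarded so the snippet does nothing when yara-python is not installed.

```python
import os
import tempfile

try:
    import yara
except ImportError:  # yara-python is not installed; the demo below is skipped
    yara = None

RULE_SOURCE = 'rule foo: bar { strings: $a = "lmn" condition: $a }'


def roundtrip(source, data):
    """Compile `source`, save it, load it back, and scan `data` with both.

    Returns two lists of matching rule names: one from the freshly
    compiled rules and one from the rules loaded from disk.
    """
    compiled = yara.compile(source=source)  # takes rule *source code*
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "rules.yarac")  # hypothetical file name
        compiled.save(path)                      # write the compiled form
        loaded = yara.load(path)                 # takes a *pre-compiled* file
        from_compiled = [m.rule for m in compiled.match(data=data)]
        from_loaded = [m.rule for m in loaded.match(data=data)]
    return from_compiled, from_loaded


if yara is not None:
    print(roundtrip(RULE_SOURCE, b"abcdefgjiklmnoprstuvwxyz"))
```

Passing rule source text to `yara.load()`, or a saved `.yarac` file to `yara.compile()`, is the mix-up this warning is about. Keeping the two steps in separate, clearly named code paths makes it harder to confuse them.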
Install
pip install yara-python
Imports
- yara
import yara
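Because several of the warnings above depend on the installed version, it can help to check it before relying on version-specific behavior. The sketch below reads the version of the `yara-python` distribution through the standard library's `importlib.metadata` and compares it against the thresholds named in the warnings (4.3.0 and 4.4.0). The function names are hypothetical helpers, not part of yara-python.

```python
import re
from importlib import metadata


def installed_yara_python_version():
    """Return the installed yara-python version string, or None if absent."""
    try:
        return metadata.version("yara-python")
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '4.5.4' into (4, 5, 4), keeping only the first three numbers."""
    return tuple(int(n) for n in re.findall(r"\d+", version)[:3])


def uses_string_match_objects(version):
    """True if match.strings holds yara.StringMatch objects (4.3.0+)."""
    return version_tuple(version) >= (4, 3, 0)


def affected_by_plaintext_leak(version):
    """True for 4.3.x, which the warnings list as affected; fixed in 4.4.0."""
    return (4, 3, 0) <= version_tuple(version) < (4, 4, 0)


version = installed_yara_python_version()
if version is None:
    print("yara-python is not installed")
else:
    print(f"yara-python {version}: "
          f"StringMatch objects={uses_string_match_objects(version)}, "
          f"plaintext() leak={affected_by_plaintext_leak(version)}")
```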
Quickstart
import yara
# Compile a YARA rule from a string
rules = yara.compile(source='rule foo: bar { strings: $a = "lmn" condition: $a }')
# Scan some data
data_to_scan = b'abcdefgjiklmnoprstuvwxyz'
matches = rules.match(data=data_to_scan)
# Process matches
if matches:
    for match in matches:
        print(f"Rule: {match.rule}, Tags: {match.tags}")
        # In YARA-Python 4.3.0+, match.strings is a list of yara.StringMatch objects
        for s in match.strings:
            print(f"  String: {s.identifier} at offset {s.instances[0].offset} with data '{s.instances[0].matched_data.decode()}'")
else:
    print("No matches found.")
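The quickstart prints only the first instance of each string and assumes the 4.3.0+ layout of `match.strings`. Code that must also run against older releases could normalize both layouts into the same `(offset, identifier, data)` tuples, as sketched below. `iter_string_hits` is a hypothetical helper, and the demo uses `SimpleNamespace` stand-ins instead of real `yara.Match` objects so that it runs without yara-python installed.

```python
from types import SimpleNamespace


def iter_string_hits(match):
    """Yield (offset, identifier, data) for either layout of match.strings."""
    for s in match.strings:
        if isinstance(s, tuple):
            # Before 4.3.0: already an (offset, identifier, data) tuple.
            yield s
        else:
            # 4.3.0+: a StringMatch holding one or more StringMatchInstance.
            for inst in s.instances:
                yield (inst.offset, s.identifier, inst.matched_data)


# Stand-ins that mimic the two layouts (not real yara objects).
old_style = SimpleNamespace(strings=[(10, "$a", b"lmn")])
new_style = SimpleNamespace(strings=[
    SimpleNamespace(
        identifier="$a",
        instances=[SimpleNamespace(offset=10, matched_data=b"lmn")],
    )
])

for fake_match in (old_style, new_style):
    for offset, identifier, data in iter_string_hits(fake_match):
        print(f"{identifier} at offset {offset}: {data!r}")
```

Iterating over every entry in `instances` also reports strings that match more than once, which indexing with `instances[0]` would miss.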