Maco Extractor
Maco Extractor is a Python package that provides the core framework for creating and running malware configuration extractors. It standardizes extractor output (using the Maco Model) and provides a consistent way to identify and execute parsers. The library is actively maintained, with frequent releases that improve compatibility, fix bugs, and add features.
Warnings
- breaking Before `v1.2.23`, the `import_extractors` utility had backwards-compatibility issues. If you rely on a custom extractor loading mechanism or run an older version, verify that your loading code still works, or upgrade to `v1.2.23` or later.
- gotcha Before `v1.2.25`, running extractors on Python 3.8/3.9 could hit issues caused by mutable default arguments in `model.py`. These were fixed by switching to `default_factory`. Older versions may show unexpected behavior on those Python versions.
- gotcha Before `v1.2.22`, `run_extractor` could raise an `UnboundLocalError` when YARA was explicitly disabled. On older versions this can crash a run that intentionally skips YARA rule processing.
- gotcha `v1.2.18` introduced a separate `maco-model` package that contains only the model definition. If you only need the data model, install `maco-model`. If you intend to write and run extractors, install `maco-extractor`, which includes the full framework.
- gotcha When writing YARA rules for extractors, the YARA rule names must be prefixed with the extractor class name to ensure proper association and triggering. Failing to do so may result in rules not being recognized or applied correctly.
- gotcha As of `v1.2.24`, `maco-extractor` explicitly shows `yara-x` warnings raised by rules within extractors. This gives more diagnostic information but increases output if your YARA rules trigger warnings. Review your rules to eliminate unnecessary warnings.
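The rule-naming requirement above can be checked before an extractor is ever loaded. The following is a minimal sketch, not part of Maco: `unprefixed_rule_names` and `RULE_NAME_RE` are hypothetical helpers, the regex only looks at `rule` declarations that start a line (it does not parse comments or strings), and a case-sensitive prefix match is an assumption.

```python
import re
from typing import List

# Matches "rule <name>" at the start of a line, allowing the optional
# "private"/"global" modifiers that YARA permits before "rule".
# This is a rough text scan, not a YARA parser.
RULE_NAME_RE = re.compile(
    r"^\s*(?:(?:private|global)\s+)*rule\s+([A-Za-z_][A-Za-z0-9_]*)",
    re.MULTILINE,
)


def unprefixed_rule_names(yara_source: str, extractor_class_name: str) -> List[str]:
    """Return rule names that do not start with the extractor class name."""
    names = RULE_NAME_RE.findall(yara_source)
    # Assumption: the prefix comparison is case-sensitive.
    return [name for name in names if not name.startswith(extractor_class_name)]


EXAMPLE_RULES = """
rule MySimpleExtractor_greeting { strings: $a = "hello maco" condition: $a }
rule other_rule { strings: $b = "test_data" condition: $b }
"""

# Lists the rules that would need renaming for a class called MySimpleExtractor.
print(unprefixed_rule_names(EXAMPLE_RULES, "MySimpleExtractor"))
```

A check like this could run in a unit test next to each extractor, so a misnamed rule is caught during development rather than at load time.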
Install
pip install maco-extractor
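Several of the warnings above depend on the installed version. One way to confirm that the installed release is at least `v1.2.25` (the newest fix mentioned in the warnings) is sketched below using only the standard library. `parse_release`, `installed_release`, and `MINIMUM` are hypothetical names for this illustration, and the sketch assumes the distribution is registered under the name used in the install command.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def parse_release(text: str) -> Tuple[int, ...]:
    """Turn '1.2.25' into (1, 2, 25); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_release(package: str = "maco-extractor") -> Optional[Tuple[int, ...]]:
    """Return the installed release as a tuple, or None if it is not installed."""
    try:
        return parse_release(version(package))
    except PackageNotFoundError:
        return None


# Newest fix referenced in the warnings (mutable defaults on Python 3.8/3.9).
MINIMUM = (1, 2, 25)

found = installed_release()
if found is None:
    print("maco-extractor is not installed")
elif found < MINIMUM:
    print(f"installed release {found} predates the fixes in v1.2.25; consider upgrading")
else:
    print(f"installed release {found} includes the fixes listed in the warnings")
```

Comparing tuples of integers avoids the string-comparison trap where "1.2.9" sorts after "1.2.25".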
Imports
- ExtractorModel
from maco.model import ExtractorModel
- Extractor
from maco.extractor import Extractor
- run_extractor
from maco.collector import run_extractor
Quickstart
import os

from maco.model import ExtractorModel
from maco.extractor import Extractor
from maco.collector import run_extractor


# Define a simple Maco Extractor
class MySimpleExtractor(Extractor):
    # YARA rules can be defined here as a bytes object.
    # Rule names must be prefixed with the extractor class name.
    # rules = b'rule MySimpleExtractor_my_rule { strings: $a = "test_data" condition: $a }'

    def run(self, sample: bytes, **kwargs) -> ExtractorModel:
        # Example: if a specific string is found, set a property in the model
        if b"hello maco" in sample:
            model = ExtractorModel(family="GreetingMalware")
            model.add_tag("found_greeting")
            model.add_string(value="hello maco", context="sample_content")
            return model
        # All extractors must return an ExtractorModel, even if no config is found
        return ExtractorModel(family="Unknown")


# Create a dummy file for the extractor to process
sample_content = b"This is some test_data with hello maco inside."
sample_path = "test_sample.bin"
with open(sample_path, "wb") as f:
    f.write(sample_content)

try:
    # Run the extractor against the sample file
    # 'extractors' expects a list of Extractor classes
    results = run_extractor(extractors=[MySimpleExtractor], sample_path=sample_path)

    # Print the results
    print(f"Extractor results for {sample_path}:")
    for result in results:
        print(f"  Family: {result.family}")
        print(f"  Tags: {result.tags}")
        print(f"  Strings: {[s.value for s in result.strings]}")
except Exception as e:
    print(f"An error occurred: {e}")
finally:
    # Clean up the dummy file
    if os.path.exists(sample_path):
        os.remove(sample_path)